
CHIMERIC MULTIVALENT PROTEIN ANALOGUE AND
METHODS OF USE THEREOF



Description 

Background of the Invention

The Immunoglobulin (Ig) Gene Superfamily
comprised of numerous cell surface and soluble
molecules that mediate recognition, adhesion
functions in vertebrates. (Abbas, A.K., et al.,
AND MOLECULAR IMMUNOLOGY, p. 144 (1991)). Members
the Ig Superfamily have an evolutionary relationship
and share significant amino acid sequence and
structural similarities. (Williams, A.F. and
A.N., IMMUNOGLOBULIN GENES, p. 372 (1989)).
criteria for membership within the family are
sequence homology with Ig or Ig-related polypeptide
domains, which are approximately 70-110 amino
residues long, and 2) key structural features
include the polypeptide domains comprised of
arrangement of two β-sheets, each made up of
five anti-parallel β-strands of five to ten
residues. (Abbas, A.K., et al., CELLULAR AND
IMMUNOLOGY, pp. 144-145 (1991)).

The Ig Superfamily domains are classified
either variable (V) or constant (C) based on
characteristics of the β-strands within the
sandwich. (Abbas, A.K., et al., CELLULAR AND
IMMUNOLOGY, p. 144 (1991)). For example, in
of Superfamily molecules, the immunoglobulin
domains are at the amino-terminal ends of separate
"heavy" (H) and "light" (L) chains, succeeded in the
polypeptide chains by constant (C) domains. Thus, in
an immunoglobulin, a V domain is defined as either VH
or VL (Figure 1). In the T cell receptor, V and C
refer to Ig-like variable and constant domains which
are comprised of polypeptide α and β chains. In other
Superfamily molecules, V and C domains may be comprised
of γ, δ, or ε chains.
In the Ig Superfamily, the polypeptide chains that
comprise the V regions associate to form ligand
binding sites. For example, in the immunoglobulin
molecule, the VH and VL domains associate to form the
variable fragment (Fv) region which comprises the
antibody binding site. The Fv region includes both
scaffold-like regions, termed framework regions (FRs),
and regions of hypervariability, termed
complementarity-determining regions (CDRs). It is the
CDRs that contribute to the unique antigen specificity
of immunoglobulins. (Abbas, A.K., et al., CELLULAR AND
MOLECULAR IMMUNOLOGY, p. 45 (1991)). Under special
circumstances, the Fv region has been proteolytically
dissected from its parent Ig to yield a variable-region
fragment (Fv fragment) that is comprised of two
non-covalently associated domains (VH·VL, a heterodimer).
This heterodimeric Fv fragment can further dissociate
into single VL and VH domains. (Huston, et al.,
Meth. Enzymol. 203:46-88 (1991)).

Recently, protein engineering methods have been
used to link VH and VL chains, creating a functional
single-chain Fv (sFv), which has a solitary antibody
binding site that does not dissociate into single
domains at low concentrations. (Huston, J.S., et al.,
Proc. Natl. Acad. Sci. USA 85:5879-5882 (August 1988)).
In this approach, the genes encoding VH and VL domains
of a given antibody are connected at the DNA level by
an appropriate nucleotide sequence, and on translation,
this gene forms a single polypeptide chain with a
peptide linker bridging the two variable domains.
(Huston, J.S., et al., Meth. Enzymol. 203:46-88
(1991)).

Summary of the Invention 

The present invention relates to a chimeric
immunoglobulin (Ig) Superfamily protein analogue having
more than one biologically active binding site.
Hereinafter, the term multivalent will be used to
describe these multiple binding sites. The chimeric
multivalent Ig Superfamily protein analogue,
hereinafter referred to as a CHI-protein, or χ-protein,
is comprised of one or more polypeptide chains forming
a β-barrel domain. A single β-barrel domain may
comprise a chimeric protein binding domain with more
than one binding site. Alternatively, more than one
β-barrel domain, such as the VL and VH β-barrel domains
in immunoglobulins, may combine to form a larger
concentric β-barrel domain having more than one binding
site.

The β-barrel domain(s) comprising the binding
regions has amino acid sequence and structural homology
with variable regions of molecules related to the Ig
Superfamily of molecules. Specifically, the binding
sites on the χ-protein are comprised of hypervariable
regions derived from molecules related to the Ig
Superfamily of molecules. The Ig Superfamily includes
immunoglobulins, cell surface antigens, such as T
lymphocyte antigens, and cell surface receptors, such
as immunoglobulin Fc receptors.
In a preferred embodiment, the hypervariable
regions are complementarity-determining regions
derived from the antigen binding sites of
immunoglobulins. In this embodiment, the
protein analogue comprises one or more polypeptide
chains forming a β-barrel domain containing
interspersed between framework regions (FRs
CDRs define one antigen binding site. This
protein analogue also has one or more additional
antigen binding sites spliced into the FRs
barrel domain.

In one embodiment, the χ-protein will
single polypeptide chain forming a β-barrel
In another embodiment the χ-protein will
single polypeptide chain comprised of two
chains, connected by a polypeptide linker
distance between the C-terminus of one chain
terminus of the other chain forming a β-barrel
In yet another embodiment, the χ-protein will
two polypeptide chains with two non-covalently
associated chains forming a β-barrel domain
of the above embodiments, the polypeptide
to form a β-barrel domain with two or more
sites.

The invention also relates to the amino
sequences that encode the χ-protein, the DNA
that encode the amino acid residues forming
protein, and expression vectors comprising
of expressing the DNA sequences. The invention
relates to methods of producing the χ-protein
The invention further relates to compositions
comprised of the χ-proteins and methods of
These compositions are comprised of χ-proteins
two biologically active binding sites and
in a variety of therapeutic and diagnostic
These compositions include, but are not limited to,
χ-protein biosensors which undergo a conformational
change when a ligand is bound to one binding site such
that the affinity of the second binding site is
modified; and χ-proteins having one binding site
reactive with a tissue-specific ligand and the second
binding site reactive with radioactive ions,
radio-opaque substances, cytotoxic substances, cytotoxic
effector cells (e.g., cytotoxic T cells), drugs, or
catalytic substances. These compositions also include
a "biochip" comprised of a two-dimensional array of
aggregated χ-protein biosensors, such as in a
Langmuir-Blodgett film, to make a functionalized membrane
useful for layers of molecular gates for computers and
the like.

The utility of binding proteins having two
independent binding sites of different specificity for
the treatment or control of tumors, virus infected
cells, bacteria and other pathogenic states has been
recognized (Segal, D.M. and Snider, P.P., Chem.
Immunol. 47:179-213 (1989)). Bispecific binding
proteins have been produced by crosslinking two or more
dissimilar but intact antibodies with a chemical agent
(heteroantibodies); crosslinking antibody fragments;
linking two single chain antibodies; or fusing a single
chain antibody to an effector molecule (Segal et al.,
U.S. 4,676,980; Tai, M., et al., Biochem. 29:8024-8030
(1990)).

However, despite considerable application to
medical research, these previous attempts to produce
bispecific binding proteins suffer from the difficulty
of complex purifications and large molecular size. For
example, each binding region of an IgG is at least 50
kD, so that a conventional crosslinked, bispecific
heteroantibody can range in mass from 100-150 kD to as
much as 300 kD, if two intact IgG antibodies are
cross-linked. Even the single chain antibody has a mass of
approximately 26 kD so that a single polypeptide chain
comprising two separate binding regions, each comprised
of a single chain antibody, has a mass of approximately
50 kD.
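
The size comparison above is simple addition, and a minimal sketch of the
arithmetic follows. The 50 kD and 26 kD figures are taken from the text;
the 150 kD mass for an intact IgG is an assumed textbook value, and the
function name `construct_mass` is purely illustrative.

```python
# Approximate masses in kD.
BINDING_REGION_KD = 50   # one IgG binding region (text: "at least 50 kD")
INTACT_IGG_KD = 150      # assumption: typical mass of an intact IgG
SFV_KD = 26              # one single-chain Fv (text: "approximately 26 kD")


def construct_mass(unit_kd, n_units=2):
    """Return the summed mass of n_units identical building blocks."""
    return unit_kd * n_units


# Two crosslinked binding regions: lower end of the quoted range.
fragment_heteroantibody = construct_mass(BINDING_REGION_KD)
# Two crosslinked intact IgG molecules: upper end of the quoted range.
intact_heteroantibody = construct_mass(INTACT_IGG_KD)
# Two single-chain Fv units on one polypeptide chain.
tandem_sfv = construct_mass(SFV_KD)

print(fragment_heteroantibody, intact_heteroantibody, tandem_sfv)
```

The tandem sFv sum of 52 kD is what the text rounds to "approximately
50 kD"; any linker between the two units would add to this figure.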

The recombinant-engineered χ-proteins of the
present invention have significant advantages over
conventionally crosslinked antibodies, or even the
single-chain Fv. The proteins of the present invention
can be custom-designed to bind specific ligands and
cell surface receptors with affinity or specificity.
These custom-designed multivalent binding proteins can
be smaller and more compact in size than intact
hetero-bispecific antibodies, Fab' antibody binding
fragments or bispecific sFv-sFv constructs.

These χ-proteins can also be less immunogenic,
thereby reducing the likelihood of immune reactions to
such therapeutic compositions. For example, in the
intact immunoglobulin molecule, the bottom loops of the
variable region are sterically protected from
recognition resulting in an immune response. However,
when disassociated from the protecting constant region,
as in the Fv, the exposed bottom loop region
potentially becomes antigenic. In a χ-protein, this
bottom loop region is no longer exposed and thus
significantly reduces the likelihood of an immune
response. Furthermore, due to their smaller size, the
χ-proteins can have enhanced stability when
administered intravenously as they will be less
susceptible to proteolysis by endogenous proteases than
larger, multi-domain proteins.



Brief Description of the Drawings

The foregoing features and advantages
invention will be apparent from the following
particular description in the following
text.

Figure 1 is a schematic representation
typical Ig Superfamily molecule, an immunoglobulin
depicting the variable (V) and constant (C)

Figure 2A is a schematic representation
depicting the relative positions of the top
and bottom loops (BL) of the folded heavy
chains. The TLs typically form the CDRs of
the BLs are typically adjacent to the C region
within an intact immunoglobulin. The BLs
loops suitable for splicing on the second

Figure 2B shows an alignment of VL and
acid sequences for which three dimensional
are known (SEQ ID NOS: 1-9 for VL and SEQ ID
for VH). The alignment of these sequences
the structural homology that exists between
especially in the β-strand regions. Regions
sequence corresponding to structural regions
β-strand (IS), outer-β-strand (OS), top loops
bottom loops (BL) are identified. The numbering
corresponds to structural position, especially
OS and IS regions.

Figure 3A is a stereo figure depicting
of the McPC 603 Fv structure wherein the
one (right side up) structure are superimposed
bottom loops of the other (upside down) structure
clarity, only the top loops (ribbons) and
(line trace) are shown.
Figure 3B is a stereo figure depicting the
alignment of H2 with the N- and C-terminal strands of
the inverted Fv structure.

Figures 4A-4G depict in stereo the positions of
the native CDRs and the CDRs spliced into the McPC 603
bottom loops to form an additional binding site
(χ-site). Corresponding native and χ-site CDR loops are
highlighted as ribbons.

Figure 5 depicts the stereo comparison of the
χ-protein comprised of native McPC603 with McPC603 CDRs
spliced into the BLs of the native McPC603.

Figure 6 depicts the splice points used when H3 of
a second McPC603 is spliced into H BL2 of the native
McPC603 sFv. The ribbon follows the spliced trace.

Figure 7 depicts the splice points used when the
L3 loop is spliced into the L BL2 of the native
McPC603.

Figure 8 depicts the splice points used when the
H1 loop is spliced into H BL4 of the native McPC603.

Figure 9 depicts the splice points used when the
L1 loop is spliced into L BL4 of the native McPC603.

Figure 10 depicts the splicing of the H2 loop into
the C-terminus of VH of native McPC603.

Figure 11 depicts the splicing of the L2 loop
onto the C-terminus of VL of native McPC603.

Figure 12 shows the final amino acid sequence of
the χ-protein comprised of two McPC603 binding sites as
constructed according to Examples 1 and 2 (SEQ ID NO:
17).

Figures 13A and B show the alignment of consensus
sequences for the various sequence classes in the Kabat
et al. compendium with the alignment of sequences of
known structure for the VH (Figure 13A) and VL (Figure
13B) chains. (Kabat, E.A., SEQUENCES OF PROTEINS OF
IMMUNOLOGICAL INTEREST Vols. I-III, U.S. Dept. Health
and Human Services, NIH Pub. No. 91-3242 (1991)). The
residue preference list in Table 2 is derived from this
figure. The first group of sequences are as in Figure
2B (SEQ ID NOS: 1-16). The second group of sequences
are consensus sequences from the groupings in Kabat et
al. (Figure 13A, SEQ ID NOS: 18-28 and Figure 13B, SEQ
ID NOS: 31-38). The third group of sequences are known
sequences for mouse immunoglobulins 26-10 and GLOOP4
(Figure 13A, SEQ ID NOS: 29-30 and Figure 13B, SEQ ID
NOS: 39-40). The line labeled KABAT indicates the
boundaries of framework and CDR regions as defined in
Kabat et al. ("-" indicate alignment gaps).

Figure 14 shows the sequences of χ-protein
constructs as constructed according to Examples 2 and 3
(SEQ ID NOS: 41-44).

Figure 15A shows the nucleic acid and amino acid
sequences of the χ-1 protein, with double dashes
indicating the D1.3 H1 χ-loop insertion in the 26-10 sFv
(SEQ ID NOS: 45 and 46, respectively).

Figure 15B shows the nucleic acid and amino acid
sequences of the χ-2 protein, with double dashes
indicating the D1.3 H1 and H2 χ-loop insertions in the
26-10 sFv (SEQ ID NOS: 46 and 47, respectively).

Figure 16A shows the SDS polyacrylamide gel
electrophoresis of the χ-1 protein purified by
ouabain-Sepharose affinity chromatography.

Figure 16B shows the SDS polyacrylamide gel
electrophoresis of the χ-2 protein purified by
ouabain-Sepharose affinity chromatography.

Figure 17 shows the elution profiles of the χ-2
protein and the 26-10 sFv from a Superdex 75 column.

Detailed Description of the Invention
The present invention relates to a chimeric
multivalent immunoglobulin (Ig) Superfamily
analogue, hereinafter referred to as a χ-protein
comprising one or more polypeptide chains forming
barrel domain. The β-barrel domain contains
hypervariable regions (hereinafter called
complementarity-determining region-like (CDR
region)) and structural regions (hereinafter
framework-region-like (FR-like region)). The
regions define ligand binding sites. Additionally
χ-protein has at least one more ligand binding
segment spliced into the FR-like regions of
barrel domain.

The Ig Superfamily molecules show significant
amino acid sequence homology within their β-
domains. (Williams, A.F. and Barclay, A.N.
IMMUNOGLOBULIN GENES p. 362 (1989)). The amino
sequences of the amino terminal domains are
variable (V) regions and the more conserved
of the remainder of the chain, termed the constant
region. (Figure 1) (Abbas, A.K., et al.,
MOLECULAR IMMUNOLOGY p. 45 (1991)). The amino
sequences of both the V and C regions of Superfamily
molecules are formed on two different polypeptide
chains, the heavy (H) chain and the light
immunoglobulins, or the α, β, γ, δ, or ε chains
other Ig Superfamily molecules. These chains
into V and C regions. For example, each H
immunoglobulin molecule folds into a VH domain
adjacent CH domain and each L chain folds into
domain with an adjacent CL. Each chain has
successive constant regions (Figure 1).

The V region contains highly variable,
stretches of sequence called the hypervariable
More conserved stretches of sequence are called
structural regions. In immunoglobulins, the
hypervariable regions are called
complementarity-determining regions (CDRs) and the
structural regions are called framework regions (FRs).
Three CDRs of each VH and three CDRs of each VL combine
in a unique three-dimensional structure to form the
antigen or ligand binding site. These CDRs determine the
ligand specificity of the protein. (Abbas, A.K., et al.,
CELLULAR AND MOLECULAR IMMUNOLOGY p. 143 (1991)).
Hereinafter, the hypervariable regions of all Ig
Superfamily molecules will be called CDR-like and all
conserved regions will be called FR-like (or CDRs and
FRs in the specific case of an immunoglobulin
molecule).

The Ig Superfamily members also show significant
homology in the structural three-dimensional features
of their V and C domains. This structural feature is
known as the Ig-fold. (Williams, A.F. and Barclay,
A.N., IMMUNOGLOBULIN GENES p. 362 (1989)). The Ig-fold
consists of a sandwich of two β-sheets constructed from
anti-parallel β-strands, each strand containing five to
ten amino acid residues. The V domain differs from the
C domain by an extra pair of β-strands in the middle of
the V domain. (Williams, A.F. and Barclay, A.N.,
IMMUNOGLOBULIN GENES p. 362 (1989)). For example, the
V domains or C domains of an immunoglobulin associate
in pairs, such as VH·VL. Each β-barrel domain, when
associated in such a pair, forms two concentric
β-barrels, with the CDR loops connecting anti-parallel
β-strands of the inner barrel.

Members of the Ig Superfamily include, but are not
limited to, Immunoglobulins, T cell Receptor Complex,
Major Histocompatibility Complex Antigens, β2
Microglobulin-associated Antigens, T Lymphocyte
Antigens, Growth Factor Receptors, and Neural Cell
Adhesion Molecules (NCAM). (Williams, A.F. and Barclay,
A.N., IMMUNOGLOBULIN GENES p. 362 (1989)).
Each ligand binding site of the χ-protein is
comprised of the CDR-like region derived from molecules
of the Ig Superfamily. For example, a ligand binding
site could be comprised of the CDRs derived from an
immunoglobulin molecule whose ligand is an antigen.
Alternatively, the ligand binding site could be
comprised of CDR-like regions from a receptor molecule
such as the T cell receptor whose ligand is the
antigen-major histocompatibility complex (MHC) molecule
which, upon binding to its receptor, initiates T cell
activation.

In particular, the present invention relates to
the immunoglobulins of the Ig Superfamily. In natural
immunoglobulins, the antibody combining site is formed
by CDRs of the VH and VL variable domains within the Fv
(variable region consisting of noncovalently associated
VH and VL, or VH·VL), as shown in Figure 1. In addition
to the CDRs determining binding specificity,
maintenance of the tertiary structure is also necessary
for biological activity. The CDRs are correctly
positioned by the conserved framework regions (FRs)
within the V regions, and the V region is further
stabilized by the C regions of the protein. However,
the minimal naturally-occurring antibody binding site
is the two chain, non-covalently associated Fv.

Recently, through recombinant protein engineering,
a single-chain Fv (sFv) has been constructed. In this
approach, the genes encoding VH and VL domains of a
given antibody are connected at the DNA level by an
appropriate oligonucleotide linker and, on translation,
this gene forms a single polypeptide chain with a
peptide linker bridging the two variable domains.
(Huston, J.S., et al., Meth. Enzymol. 203:46-88 (1991)).

Isolated single domain antibodies have
recently been constructed, comprised of only
instead of the usual complement of six. (Ward
et al., Nature 341:544-546 (1989)). In some
these single domain antibodies (i.e., VH or
binding activities comparable to their parent
antibodies.

In a preferred embodiment of the present
invention, one of the χ-protein binding sites
comprised of a set of CDRs from a mouse myeloma
such as 26-10 or McPC603 (Huston, J.S., et al.,
Enzymol. 203:46-88 (1991)). An additional
segment is spliced into the β-barrel domain
spliced segment is comprised of CDRs from
a second mouse myeloma protein, which are
the bottom FR loops of the β-barrel domain,
to the C-terminal ends of each V domain.

The structural basis for this invention
explained with reference to Figure 2B (SEQ
16), where the typical variable region
an immunoglobulin is noted. There are three
within each VH and VL which together constitute
complete antibody binding site. These CDRs
interspersed between FRs such that the light
region is denoted FR1-L1-FR2-L2-FR3-L3-FR4
heavy chain is denoted FR1-H1-FR2-H2-FR3-H3
of these V region chains folds into a native
conformation that comprises a double layer
V domains may be monomeric (VH or VL), or
into homodimers (VH-VH or VL-VL), or into
heterodimeric Fv (VH-VL). Each of these
constructed as single-chain analogues. In the
single-chain Fv, these two sequences are connected in
tandem with a bridging linker to make, for example,
VH-linker-VL (VH-VL) or VL-linker-VH (VL-VH).
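
The alternating framework/CDR notation above lends itself to a small
illustration. The sketch below simply generates the region order for a
light or heavy chain and for a tandem single-chain construct. The function
names are hypothetical, and ending the heavy chain with FR4 is an
assumption made by analogy with the light chain, since the source line is
cut off after H3.

```python
def v_region_layout(chain):
    """Return the alternating FR/CDR order for an 'L' or 'H' variable region."""
    if chain not in ("L", "H"):
        raise ValueError("chain must be 'L' or 'H'")
    parts = []
    for i in (1, 2, 3):
        parts.append(f"FR{i}")       # framework region preceding the CDR
        parts.append(f"{chain}{i}")  # CDR: L1..L3 or H1..H3
    parts.append("FR4")              # assumed closing framework region
    return parts


def sfv_layout(first="H", second="L"):
    """Return the region order of a single-chain Fv: V-linker-V."""
    return v_region_layout(first) + ["linker"] + v_region_layout(second)


print("-".join(v_region_layout("L")))
print("-".join(sfv_layout("H", "L")))
```

Calling `sfv_layout("L", "H")` gives the VL-linker-VH ordering instead.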
In these folded configurations of the V region
polypeptide, alternating directions of the β-strands
loop out at either end as they form anti-parallel
strands. In each region the immunoglobulin fold has
loops on top (the binding site is "on top") and on the
bottom (partly in close proximity to constant domains
for intact H and L chains).

On top of each V domain, four loops are present of
which three are CDRs that contribute to the antigen
binding site, and on the bottom, four loops are present
that allow the β-strands to switch directions and fold
back into the globular domain. These bottom loops are
here termed BL1, BL2, BL3, and BL4, and apply to the L
and H variable regions, respectively called L BL1-4 and
H BL1-4 (Figure 4A). These bottom loops provide
insertion sites for fusion proteins or peptide
segments, which are further augmented by splicing of
peptide sequence at the C-terminus of each V region.
The use of these bottom loops as splice sites permits
incorporation of alternate binding, catalytic or
effector sites which complement the naturally present
antigen binding site.

However, novel demands are made on variable region
architecture in order to construct additional binding
sites on the bottom of a V domain. In particular,
there is a requirement that the relative orientation of
the CDR-like loops (CDR or top loop symmetry) for an Fv
region be reproduced to a reasonable approximation in
the relative orientation of the bottom loops of the Fv
(bottom loop symmetry). Whether or not this symmetry



WO 93/23537

PCT/US93/04338

relationship exists is not an obvious question to ask
and the possibility of top-bottom loop symmetry has not
been previously recognized. Nonetheless, molecular
modeling and computational analysis demonstrate that
such an approximation to the CDR-like loop symmetry
does exist for the bottom loops of the Fv region.

The superposition of top and bottom loops involves
two interrelated observations: (1) the two fixed
endpoints of each top loop (CDR) must find a
corresponding match with bottom loop endpoints; and (2)
the polypeptide chain directionality must be maintained
after bottom loop splicing (i.e., N to C directionality
of the spliced CDR must be the same as the bottom loop
that it replaces). The overall architecture of the Fv
region appears asymmetric because the ends of the light
and heavy chain β-barrels are closer together at the
top than they are at the bottom. However, much of this
apparent difference derives from there being one
quarter of the Fv residues in CDRs on top of the
framework, which fill in space that is open on the
bottom.
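
The two requirements just stated, matching endpoints and preserved N to C
directionality, can be illustrated with a short check. This is only a
sketch under stated assumptions: the loops are taken to be already
superimposed, each loop is a list of (x, y, z) coordinates ordered from N
to C terminus, and the function name and the 2.0 tolerance are hypothetical
choices, not values from the source.

```python
import math


def splice_compatibility(top_loop, bottom_loop, tol=2.0):
    """Classify how a top loop (CDR) relates to a bottom loop after superposition.

    Returns "compatible" if both endpoints match in the same N-to-C order,
    "reversed" if they match only with opposite chain direction,
    and "no match" otherwise.
    """
    # Same direction: N end against N end, C end against C end.
    n_fwd = math.dist(top_loop[0], bottom_loop[0])
    c_fwd = math.dist(top_loop[-1], bottom_loop[-1])
    if n_fwd <= tol and c_fwd <= tol:
        return "compatible"
    # Opposite direction: endpoints coincide but the chain runs the other way.
    n_rev = math.dist(top_loop[0], bottom_loop[-1])
    c_rev = math.dist(top_loop[-1], bottom_loop[0])
    if n_rev <= tol and c_rev <= tol:
        return "reversed"
    return "no match"


# Toy coordinates, invented purely for illustration.
top = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (3.0, 0.0, 0.0)]
bottom = [(0.5, 0.0, 0.0), (1.5, -2.0, 0.0), (3.2, 0.0, 0.0)]
print(splice_compatibility(top, bottom))
print(splice_compatibility(top, bottom[::-1]))
```

A "reversed" result corresponds to a loop whose endpoints line up but whose
chain direction is opposite, which would violate requirement (2).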

In fact, there is sufficient correlation between
top and bottom loop symmetry of the intact Fv region to
make the multiple binding site molecule possible. This
is discernible if one superimposes the top of one Fv
(#1) with the bottom of another Fv (#2); if #1 and #2
are the same, then the correlation helps decide how to
construct a multivalent Fv, whereas if #1 and #2 are
different, the additional binding site (χ-site) is
distinct from the native Fv binding site, and overall
one thereby designs a chimeric multivalent Ig
Superfamily protein analogue which has one or more
sites of different specificity.
We will represent the composition of the χ-site
with parentheses so that A(B) represents a χ-protein
comprising the CDR-like regions of immunoglobulin
Superfamily molecule B built into the χ-site
immunoglobulin Superfamily molecule or analogue
Likewise VH(VL) represents a domain with
whose loops are derived from the CDR loops
domain, consequently the designation A(B)
denotes a χ-protein based on the FR and CDR
having a χ-site based on the CDR loops of
the χ-site loops on the heavy chain of A are
from the CDR loops of the light chain B and
and the two χ domains are linked by a polypeptide
linker such that the heavy chain FR domain
light chain FR domain.

As shown in Figure 3A, by applying the
procedures to McPC603 for both Fv #1 and #2
following pairs of loops align:

1. The TL4 loops (L3 and H3) of Fv #1 can
very closely with the BL2 loops of Fv #2
the alignment of these two sets of loops
additional alignments become apparent.

2. The TL1 loops (L1 and H1) are approximately
superimposed on the respective BL4 loops

3. In addition, as shown in Figure 3B, the
the TL2 loops (L2 and H2) are approximately
superimposed on the N-terminal and C-terminal
strands of the respective β-barrel domain

In this structural alignment, all of the
fundamental design criteria for splicing a
binding site onto the bottom of V, Fv, or
are satisfied: the proper chain directionality
superpositions are found for the TL4 - BL2 pair, the
TL1 - BL4 pair, and the ends of the TL2 loops relative
to the N- and C-terminal strands of the β-barrel
region.
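
The loop pairings listed above, combined with the splice sites named in the
descriptions of Figures 6 through 11, can be tabulated as a simple lookup.
The sketch below is only a convenience table; the dictionary and function
names are hypothetical.

```python
# CDR loop of the donor binding site -> site on the source V domain where it
# is spliced, following the TL4-BL2 and TL1-BL4 pairings and the placement of
# the TL2 loops (H2, L2) at the C-terminus of each V domain.
SPLICE_SITES = {
    "H1": "H BL4",
    "L1": "L BL4",
    "H3": "H BL2",
    "L3": "L BL2",
    "H2": "VH C-terminus",
    "L2": "VL C-terminus",
}


def splice_plan(cdrs):
    """Return (CDR, splice site) pairs for the requested donor CDR loops."""
    return [(cdr, SPLICE_SITES[cdr]) for cdr in cdrs]


for cdr, site in splice_plan(["H1", "H2", "H3", "L1", "L2", "L3"]):
    print(f"{cdr} -> {site}")
```

Passing a shorter list, for example `["H3", "L3"]`, corresponds to a partial
binding site built from fewer than six CDRs.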

This construction is consistent to a good
approximation with the natural geometry for all CDR
loops, thereby generating a ligand binding site on the
bottom of the β-barrel domain similar to the "source"
Fv binding site. Similarly, an additional ligand
binding site may be built on the separate
non-covalently linked V domains of a heterodimeric Fv
(i.e., VH-VL). Alternatively, as it has recently been
shown that the full set of 6 CDRs is not necessary for
ligand binding, partial binding sites (i.e., fewer
than 6 CDRs) may be assembled on a single V domain.
Thus, a χ-protein could comprise a single β-barrel
domain with two ligand binding sites, each binding site
with as few as one CDR and still exhibit binding
activity.

H BL1 or L BL1 loop replacement could also be
useful, as they are in proximity to the rest of the
binding site. However, their peptide chain
directionality is opposite to what would be needed to
correctly splice in H2 and L2 loops. Nonetheless, such
loops could be devised de novo by design or mutagenesis
to facilitate such replacement.

Furthermore, in most Fv regions, the H or L BL3
loops are located on the sides of the V domains.
Although these loops are not contiguous with the rest
of the additional binding site, ancillary loops could
be attached at such sites, thereby providing the
addition of a "ligand-like" surface feature for
recognition by the appropriate receptor, thus forming
a χ-protein with more than two binding sites. A single
substitution of only one loop, with an appropriate
peptide, could provide a means to anchor the binding
protein to a particular receptor. Moreover, the H or L
BL3 (or H or L TL3) loops could provide the means to
crosslink single domain χ-proteins into aggregate
sheets to form two dimensional arrays of χ-proteins.

CDR-like region sequences of any of the proteins
from the Ig Superfamily showing the requisite sequence
and structural homology can be spliced into the bottom
loops of the source Fv. It should be noted that
CDR-like region sequences from other Ig Superfamily
molecules can also replace all or a part of the native
CDR-like region of the χ-protein. Thus a χ-protein can
comprise the CDR-like regions from molecule A, the
FR-like regions from molecule B and the χ-site having
CDR-like regions from molecule C spliced in.

As discussed in detail in Examples 1 and 2, using
the Fv derived from mouse myeloma IgA antibody, McPC
603, a second McPC 603 binding site can be constructed
on the bottom of the source Fv. Although the Fv region
of McPC 603 is exemplified, as shown in Example 3, it
is reasonable to splice in a CDR-like region from other
Ig Superfamily proteins, due to the sequence and
structural homology that exists among these proteins
(Figure 2B (SEQ ID NOS: 1-16)).

When CDR-like sequences are spliced into the lower
FR-like loops of the source Fv, it is important to
preserve those FR-like residues that are critical for
maintaining the proper folding. These critical FR-like
residues are located in the stems of the loops and
underlying the CDR-like regions. Three criteria govern
the selection of the proper splice points: first, as
discussed above, the need to preserve critical FR-like
residues; second, the desire to incorporate as much of
the CDR-like sequence as possible; and third, the
practical need to switch from one backbone to the other
at points where the alpha carbons of the two chains are
reasonably well aligned. The last requirement is
necessary in order for the spliced loops to maintain
their native, biologically active conformation.
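
The three splice-point criteria can be combined into a simple scan, sketched
below under stated assumptions. Alpha-carbon coordinates of the two
superimposed backbones are held in dictionaries keyed by structural
position; candidate positions are listed from the framework side toward the
loop tip, so the first acceptable position keeps the most CDR-like sequence.
The function name, the 1.0 tolerance and the toy coordinates are all
hypothetical.

```python
import math


def pick_splice_point(source_ca, target_ca, candidates, protected, tol=1.0):
    """Return the first candidate position suitable for switching backbones.

    source_ca, target_ca: dicts mapping structural position -> (x, y, z)
        alpha-carbon coordinates of the superimposed source and donor chains.
    candidates: positions ordered from the framework stem toward the loop tip.
    protected: positions of critical FR-like residues that must be preserved.
    Returns None if no candidate satisfies all criteria.
    """
    for pos in candidates:
        if pos in protected:
            continue  # criterion 1: keep critical framework residues
        if math.dist(source_ca[pos], target_ca[pos]) <= tol:
            return pos  # criterion 3: alpha carbons reasonably well aligned
    return None


# Toy data, invented for illustration only.
source = {10: (0.0, 0.0, 0.0), 11: (1.0, 0.0, 0.0), 12: (2.0, 0.0, 0.0)}
donor = {10: (0.0, 0.2, 0.0), 11: (1.0, 3.0, 0.0), 12: (2.0, 0.5, 0.0)}
print(pick_splice_point(source, donor, [10, 11, 12], protected={10}))
```

Because the scan stops at the first acceptable position on the framework
side, it also addresses the second criterion of incorporating as much of
the CDR-like sequence as possible.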

As described in detail in Examples 1 and 2, the
crystallographically determined coordinates of a mouse
myeloma protein structure, such as McPC603, are
visually displayed using a suitable computer graphics
system. This display of the protein in three
dimensions can be spatially rotated, turned or twisted
as necessary to view the polypeptide chains or peptide
backbone of the Fv region. Using the graphics program,
substitutions, additions or deletions can be made to
the structure. These modifications are energy
minimized (optimized) to account for steric hindrance,
bond lengths, bond angles and energy constraints so as
to maintain the critical tertiary structure necessary
for biological binding activity. Thus, in a step by
step process involving modification of the polypeptide
chain and successive cycles of refinement calculations
for each modification, a three-dimensional model of the
χ-protein, with two distinct binding sites, is built.
As described in Examples 1 and 2, the model used
was McPC603 sFv constructed as VH-VL. However, sFv
proteins constructed as VL-VH, VH-VH, or VL-VL are also
intended to be encompassed by this invention. Two
chain Fv regions (i.e., non-covalently associated VH
and VL domains, VH-VL), as well as single domain
antibodies (VH or VL), are also intended to be
encompassed by this invention. In addition, any other
Ig Superfamily molecule having the requisite sequence
and structural homology is also intended to be
encompassed by this invention.
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Alternatively, since the sequence would be known 
for the Ig superfamily CDR-like regions of interest, 
alignment procedures that relate tertiary to primary 
structure are useful to predict successful x-pj^oteins. 
5 Figures 13 A (SEQ ID NOS: 10^16 and 18-30) and B (SEQ ID 
NOS: 1-9 and 31-40) depicts the amino acid residue 
alignment of Fv regions from mouse myeloma antibodies 
and other Ig superfamily molecules. As described in 
detail in Example 3, this type of sequence alignment 

10 allows approximate choice of splice points without 

having a three-dimensional crystallographic structure. 
The tertiary structures of framework regions are highly 
conserved, thus allowing reliable three-dimensional 
models to be predicated on sequence alignment methods. 

However, to correctly splice "target" CDR-like
regions into the source Fv, some modifications to the
source FR-like region sequence may be necessary.
Moreover, it may be desirable to modify the native CDR-
like sequence which is spliced into the source Fv, or
the source Fv itself, to modify binding affinity or
specificity. Such modifications are intended to be
encompassed by the subject invention.

For example, one or more amino acid residues can
be substituted by another amino acid of a similar
polarity which acts as a functional equivalent,
resulting in a silent alteration. Substitutes for an
amino acid within the sequence may be selected from
other members of the class to which the amino acid
belongs, such as the nonpolar, polar, and positively or
negatively charged amino acids.
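
The class-based substitution rule described above can be sketched in a few lines of Python. This is only an illustration: the text names the four classes but does not list their members, so the grouping below is one common textbook assignment, and the names `AMINO_ACID_CLASSES`, `residue_class` and `is_conservative` are hypothetical.

```python
# One-letter codes grouped by side-chain class. The membership of each
# class is an assumed textbook convention, not taken from the specification.
AMINO_ACID_CLASSES = {
    "nonpolar": set("AVLIPFWM"),
    "polar": set("GSTCYNQ"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def residue_class(residue):
    """Return the class name for a one-letter amino acid code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, substitute):
    """True when both residues belong to the same class."""
    return residue_class(original) == residue_class(substitute)
```

Under this grouping, replacing Leu with Ile or Asp with Glu counts as a same-class substitution, while replacing Lys with Asp does not. Whether a given same-class substitution is actually silent would still have to be checked against binding activity.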

The structure may be modified by deletions,
additions, substitutions and insertions of one or more
amino acids which do not substantially detract from the
desired functional properties of the x-protein.

Naturally occurring allelic variations and
modifications are included within the scope of this
invention so long as the variation does not
substantially reduce the ability of the x-protein to
bind its ligand.
Based on the method described in Example 3, a
x-protein has been partially constructed incorporating
two loops of an anti-lysozyme monoclonal antibody in
the 26-10 sFv, as described in detail in Example 4. As
shown in Figures 15A and 15B (SEQ ID NOS: 45-48),
constructions have been made which incorporate the H1
and H2 loops of the D1.3 anti-lysozyme monoclonal
antibody in the appropriate x-sites at the bottom of
26-10 sFv, where they are designated H1' and H2'. The
design process has involved successive addition of x-
loops, so that there are constructions with H1' alone
(x-1, Figure 15A), and H1' + H2' together (x-2, Figure
15B). In constructing x-1 and x-2, the only deviations
from the corresponding partial constructs described in
Example 3 (Figure 14, 26-10(D1.3) sequence) are the
following: (1) the isoleucine at the N-terminal end of
H2' at Vh residue 112 is a valine in x-2 due to
limitations imposed by the restriction sites that were
utilized, and (2) in both x-1 and x-2, the linker is
(Ser4Gly)3Ser, which reasonably confers better solubility
properties on the proteins than that noted in Figure
14, Ala5(Gly4Ser)3. The stepwise incorporation of the
remaining D1.3 CDRs (H3', L1', L2', and L3') can be
accomplished by the same procedure.

As each CDR loop was inserted in the x-site, the
cloned proteins were expressed in E. coli and refolded.
The recovery of ouabain binding capacity by these x-
proteins demonstrates that the insertion of these x-
CDRs was compatible with proper refolding of the 26-
10(D1.3) sFv polypeptide chain, thereby generating a
26-10 digoxin binding site in each case. (Ouabain is a
digoxin analogue with a Ka for association with 26-10 sFv
of about 10'' M-1, and ouabain-based affinity
chromatography is the preferred isolation method for
digoxin binding proteins.)

Thus, experimental results support the rationale
of inserting CDR loops in place of turns in β-sheet
structure or near the C-terminal ends of V-regions.
The maintenance of V region integrity after insertion
of x-loops is a fundamental premise for the
construction of these dual binding site x-proteins.
The ability to make a biologically active xsFv protein
(i.e., a x-protein that retains its binding activity
due to proper conformation) with 2 inserted loops (here
comprising 24 CDR residues) is a remarkable result, in
that the phage display manipulation of sFv binding
sites may be used in entirely novel ways to genetically
convert these added x-site residues into virtually any
desired antigen specificity. These results also
reasonably support the anticipated ability to closely
mimic a parent antibody combining site once more x-CDRs
have been incorporated.

The present invention also relates to the amino
acid sequences encoding the x-protein. Once the
optimal splice points for CDR-like region insertion
have been determined and shown to adequately reproduce
the native site, the linear amino acid sequence can
then be deduced. The x-protein, or its modified
equivalent, can then be made by recombinant DNA methods
or synthesized directly by standard solid or liquid
phase chemistries for peptide synthesis, or other
methods well known to one skilled in the art. For
example, the amino acid sequence of the x-protein
McPC603 (McPC603) shown in Figure 12 can be synthesized
by the solid phase procedure of Merrifield, by
semisynthesis, or by enzymatic or chemical combinations
of appropriately modified blocks of peptides.

The present invention also relates to nucleic acid
sequences, DNA and RNA, that encode the x-protein, and
expression vectors capable of expressing the x-
proteins. Preferably, the x-proteins of the subject
invention will be produced by inserting DNA encoding
the desired amino acid sequence of the x-protein into
an appropriate vector/host system where it is
expressed, or the gene may be used in a cell-free
rabbit reticulocyte ribosomal protein synthesis system.

A variety of host/vector systems can be used to
express the polypeptides of this invention, such as the
one described in Example 4. Primarily, the vector
system must be compatible with the host cell used.
Host/vector systems include, but are not limited to,
the following: bacteria transformed with bacteriophage
DNA, plasmid DNA or cosmid DNA; microorganisms such as
yeast containing yeast vectors; mammalian cell systems
infected with virus (e.g., vaccinia virus, adenovirus,
etc.) or expressing plasmids; insect cell systems
infected with virus, such as baculovirus; or Xenopus
oocytes injected with DNA encoding the x-protein.
Other methods well known to one skilled in the art may
also be used.

Any of the standard recombinant methods for the
insertion of DNA into an expression vector can be used.
The recombinant vector can be introduced into the
appropriate host cells (bacteria, virus, yeast,
mammalian cells or the like) by transformation,
transduction or transfection, depending upon the
host/vector system, and cultured to express the
polypeptides of this invention.

Although strategies used to produce recombinant
proteins have often relied upon bacterial expression of
fusion proteins followed by in vitro refolding of the
protein, direct expression and secretion also may be
used to produce the x-proteins of the subject
invention. Expressed fusion proteins may require
purification and possible removal of the leader
sequence before refolding of the x-protein to recover
binding activity. However, this may be accomplished
using routine laboratory procedures. Direct expression
of the x-protein can potentially produce the protein
without the leader, but still in need of refolding,
whereas secretion can ideally produce native protein
for subsequent isolation. (Huston, J.S., et al.,
Methods Enzymol., 203:46-88 (1991)).

Alternatively, fusion proteins may be the end
product of expression. For example, a cytochrome b5
tail could be fused to the x-protein (at the gene
level), to make a membrane anchor that could hold the
x-protein in a membrane or Langmuir-Blodgett film. A
tail or leader sequence could also include a specific
recognition sequence for enzymatic or chemical
modification. Such an engineered, post-translational
modification could allow specific incorporation of
moieties such as biotin or phosphoinositol.

Possible refolding protocols include dilution
refolding, redox refolding and disulfide-restricted
refolding. (Huston, J.S., et al., Methods Enzymol.
203:46-88 (1991), the teachings of which are hereby
incorporated by reference). Dilution refolding relies
on the observation that fully reduced and denatured
antibody fragments can refold on removal of denaturant
and reducing agent with recovery of specific binding
activity. (Haber, E., Proc. Natl. Acad. Sci. U.S.A.,
53:524 (1964)).

Redox refolding utilizes a glutathione redox
couple to catalyze disulfide interchange as the protein
refolds into its native state. (Saxena, P. and
Wetlaufer, D.B., Biochem., 9:5051 (1971); Huston,
J.S., et al., Methods Enzymol. 203:46-88 (1991)).

Disulfide-restricted refolding involves initial
formation of intrachain disulfides in the fully
denatured protein. This capitalizes on the favored
reversibility of antibody refolding when disulfides are
kept intact. (Buckley, C.E., et al., Proc. Natl. Acad.
Sci. U.S.A., 50:827 (1963); Huston, J.S., et al.,
Methods Enzymol. 203:46-88 (1991)). Disulfide
crosslinks should restrict the initial refolding
pathways available to the molecule. For chains with
the correct disulfide pairing, the recovery of a native
structure should be favored, while those chains with
incorrect disulfide pairs must necessarily produce
nonnative species on removal of denaturant.

Characterization of the multiple ligand binding
sites requires that both their affinity and specificity
be determined. Ideally, the measurement of binding
affinity should use a thermodynamically rigorous
approach such as equilibrium dialysis or
ultrafiltration. In the absence of such methods,
agreement between two distinct methods is desirable to
reinforce the veracity of the data. Ligand binding to
high affinity binding sites frequently involves very
fast rates of binding and slow rates of dissociation.
This characteristic makes measurement of their
association constants amenable to routine immunoassay
procedures, such as immunoprecipitation,
radioimmunoassay and ELISA. (Huston, J.S., et al.,
Methods Enzymol. 203:46-88 (1991)).

After evaluating the binding activity of a
particular model x-protein, it may be desirable to
further refine the initial x-protein to enhance its
biological activity. This refinement may be performed
computationally by additional computerized modeling or
it may be a "biological" refinement using recently
developed techniques such as the use of genetically
engineered bacteriophage to display and/or secrete the
x-protein and thereby select for an improved
conformation. (Marks, J.D., et al., J. Mol. Biol.
222:581-597 (1991)).

Additional optimizing techniques, as described
above, may also be used to confer "special" properties
on the x-protein, such as "humanizing" the x-protein
using human FR-like regions as the scaffold protein to
reduce the possibility of adverse immunologic reactions
during therapy. (Daugherty, B.L., et al., Nucleic
Acids Res. 19:2471-2476 (1991)). Special properties
may also include increased solubility or stability or
improved renaturability or secretion.

This invention further relates to the method of
producing the x-protein, which includes the steps of
determining the splice points for additional binding
sites, as described in Examples 2 and 3, determining
the amino acid sequence of the resulting construct,
deducing the DNA sequence encoding that amino acid
sequence, inserting the DNA sequence into an
appropriate expression vector and expressing the
protein in a suitable host system, refolding the
protein to its biologically active conformation and
analyzing its biological binding activity. For
example, in the case of a x-protein with catalytic
activity, this would include measurement of enzymatic
properties.

The ability to target therapeutic agents in a host
with antibodies has been a long-term goal of medical
research. The term host, as used hereinafter, is
intended to encompass mammalian hosts, including
humans. The most elegant targetable proteins would
consist of the minimum structures needed for selective
delivery and effector function. (Tai, M., et al.,
Biochem. 29:8024-8025 (1990)). Although the utility
of binding proteins having multiple binding sites of
different specificity has been widely recognized (U.S.
Patent 4,676,980), these crosslinked antibodies or
crosslinked antibody fragments suffer from the need for
complex purification schemes and large molecular size.
Notwithstanding the construction of the smaller-
sized sFv, the need still exists for a multifunctional
binding protein with as reduced a size and as compact a
shape as possible to permit effective therapeutic use.
The subject invention also relates to use of these x-
proteins in therapeutic and diagnostic procedures.
These uses include, but are not limited to, in vivo and
in vitro imaging agents, and delivery agents for drugs,
radioisotopes, and cytotoxic substances. The x-protein
could also include a binding site for an effector
molecule, such as an enzyme, growth factor, cell-
differentiation factor, lymphokine, cytokine, hormone,
anti-metabolite, or an ion-sequestering sequence such
as calmodulin. The x-protein could further include a
binding site reactive with antibody-dependent cytolytic
cells or cytotoxic T cells. Also included are
x-proteins with catalytic or biosensor activity, as
well as binding sites that facilitate affinity
purification procedures.

In some instances, the primary binding site could
be a high affinity site that targets the protein to
specific cell surface locations, and the secondary site
designed to decrease normal receptor activity. A x-
protein could be designed with one binding site
comprising a receptor for a cell surface antigen and a
second binding site comprised of a modified receptor
with diminished affinity for its ligand. Thus the x-
protein would bind a cell via the normal binding
site, leaving the binding site with diminished binding
activity exposed, thereby resulting in decreased
receptor activity. For example, a x-protein could be
designed with neural cell adhesion molecule (NCAM)
variable regions that would modulate cell interactions
such as contact inhibition of malignant cells.

A x-protein may also be designed to modify,
enhance or inhibit cell-cell interactions or
communication. For example, one binding site could be
designed to target a cancer cell and a second binding
site could be designed to target an effector cell, such
as a macrophage, thus binding the macrophage in a
manner which results in destruction of the targeted
cell.

Moreover, a x-protein could mediate phenomena such
as antibody-dependent cellular toxicity. (Huston,
J.S., et al., Proc. Natl. Acad. Sci. USA, 85:5879-5883
(1988)). For example, a x-protein could be designed
with one binding site comprised of a receptor for a
cell surface marker protein and a second site derived
from a receptor for killer T cells. The x-protein
would bind to the target cell and the secondary site
would bind the cytotoxic T cell, resulting in
destruction of the target cell. The x-proteins of the
present invention would exhibit significantly greater
tissue accessibility or tumor penetration and faster
pharmacokinetics due to their compact size.

Recently, it has been shown that an Fv undergoes a
conformational change upon antigen binding. (Bhat, T.
N., Nature, 347:483 (1990)). It is reasonable to
predict that the x-proteins of the present invention
would undergo similar conformational changes, due to
the significant sequence and structural homology with
the native Fv. Although this conformational change is
modest, it involves the translational movement of the
Vh and Vl domains relative to each other, such that the
orientation of the bottom loops can shift. This makes
the x-protein conformation sensitive to binding at
either the first or second binding site, and requires
that linkage exists between their binding equilibria.
It is reasonable to predict that this conformational
change would enable the x-protein to act as a
"molecular switch".

For example, the first site of the x-protein may
be a catalytic antibody combining site (e.g., a site
that catalyzes the conversion of a "pro-drug" to a
cytotoxic drug), while a second site is specific for
binding to a marker on a cell surface (e.g., a tumor
cell). If binding to the cell surface epitope induces
a conformational change in the x-protein such that the
loops of the catalytic site are brought into optimal
geometry for efficient catalysis, then the pro-drug
would be converted to cytotoxic drug directly at the
site of the tumor, so that high cytotoxic levels could
be maintained at the site while serum levels remained
low. The unbound catalyst would have low efficiency
and, because of its low molecular weight, would clear
rapidly from circulation.

The x-proteins are well suited to applications in
immunotargeting because of their binding capabilities
and compact size. The absence of constant domains will
reduce nonspecific background binding, thus enhancing
visualization of target tissue by in vivo or in vitro
imaging procedures. The use of x-proteins in in vitro
diagnostics would also reduce nonspecific binding,
thereby increasing the accuracy of these assays.

The x-proteins of the present invention are also
useful to deliver drugs, cytotoxic substances or
effector molecules to immunotargeted tissues. One of
the binding sites of the subject protein may exhibit
catalytic activity that is triggered by binding of a
specific ligand to the other binding site.

The invention will be further illustrated by the
following Examples, which are not intended to be
limiting in any way.

Example 1 

Constructing a Model of Bivalent McPC603 xsFv

The following illustration, which utilizes the
McPC603 Fv region (SEQ ID NOS: 1 and 10), for which
the three-dimensional crystallographic structure has
been determined, has been chosen in order to assess how
well a particular set of splice points will generate a
x-site (additional binding site) that recreates the
parent binding site. This example also reveals the
general method of building a x-protein model, by the
specific example of mapping the McPC603 Vh CDRs to Vh
bottom loop positions, and the Vl CDRs to Vl bottom
loop positions. Molecular modeling was performed on a
Silicon Graphics 4D/70GT superworkstation using the
Biosym programs INSIGHT (to visualize models), HOMOLOGY
(to assemble spliced CDR/FR sequences), and DISCOVER
(to minimize the energy of the model) (Biosym, Inc.,
San Diego, CA). The HOMOLOGY program was used to
assemble the structure in five steps, and the DISCOVER
program was used to minimize the energy of the
resulting model about the splice points. Although
HOMOLOGY was used in this example, this program is not
necessary to build the model. Any one of several
alternative molecular mechanics programs, such as AMBER
and CHARMM, could be used instead of the DISCOVER
program.

Two copies of the McPC603 sFv structure were
superimposed (Figure 3A): model A had the binding site
up and model B had the bottom loops up; the CDR
loops of B superimposed on the lower loops of A, as
described above. The primed loops (e.g., H2') are those
spliced into the x-site.

1. Building the H2' loop and the bridge. (Figures 4A
and 4B). The A coordinates for the Vh domain were
used up to where H2' from structure B splices into
the C-terminus of structure A, and the A
coordinates of the linker and the following Vl
domain were likewise used. The coordinates of the
H2' segment were taken from B. Once coordinates
had been assigned to the residues flanking the
bridge peptide region, the HOMOLOGY loop search
algorithm was used to find a 5 residue peptide
whose flanking 3 alpha carbons overlapped well
with the 3 amino acids on either side of the gap,
and whose conformation best fit the surface of the
Vh domain. The HOMOLOGY program automatically
connects the peptide bonds at the splice points to
create the new single chain structure called MDL1.
It will generally be necessary to rotate side
groups in the interface between Vh and H2' out of
the way in order to reduce steric conflicts. One
residue in H2' had to be changed because simple
rotation could not eliminate its steric conflicts:
Phe 137 (position H70) -> Ala. Using the DISCOVER
program, the atoms about the splice points were
moved so as to minimize contributions to the
energy of the structure due to improper bond
lengths and geometries. At this point the
composition of the bridge peptide was polyalanine;
additional computational analysis of the model may
be used to further assess the optimum length, path
and composition of the bridge. This new structure
(with H2' and the bridge) is called MDL1.

2. Building H3' and H1' Loops. (Figures 4C and 4D).
Structure MDL1 supplied the coordinates for the
residues preceding and following the segment where
H3' was to be built into the model. The
coordinates for the H3' segment were taken from
structure B. Since H1' was a short segment of 3
residues that would replace exactly 3 residues of
MDL1, H1' was built by side chain substitution.
This structure is MDL2.

3. Building L3'. (Figure 4E). Structure MDL2 supplied
the coordinates for the residues preceding and
following the segment where L3' was to be built
into the model. The segment for L3' was built
into the next stage of the model in a manner
analogous to the way that MDL2 was built. The
resulting structure is MDL3.

4. Building L1'. (Figure 4F). Structure MDL3 supplied
the coordinates for the residues preceding and
following the segment where L1' was to be built
into the model. The segment for L1' was built
into the next stage of the model in a manner
analogous to the way that MDL2 was built. This
structure is MDL4.

5. Building L2'. (Figure 4G). Structure MDL4 supplied
the coordinates for residues preceding the segment
where L2' was to be built into the C-terminus of
the model. The segment for L2' was built into the
next stage of the model in a manner similar to the
way that MDL1 was built, except that no connection
(bridge or linker) was made to the C-terminus of
the L2' segment. In order to stabilize the C-
terminus of the L2' loop, a disulfide bridge was
constructed between the C-terminus of L2' and the
segment that derived from the N-terminal strand
by substituting the Arg in L2' (at the position
corresponding to L68 in Vl) and the Thr at the
position corresponding to L5 in Vl with Cys.

6. Refinement. At each stage, atoms making bad
contacts were rotated out of the way. This often
required breaking the peptide bond between
neighboring residues, rotating a backbone bond and
then reforming the peptide bond. The
functionality for these modeling operations is
built into the INSIGHT program and other similar
molecular modeling programs. The resulting
modified peptide bond was added to the list of
splice points. Strain in the splice point peptide
bonds was minimized by subjecting the residues on
either side of the spliced peptide bond to 100
cycles of steepest descent minimization using the
DISCOVER molecular mechanics program. The final
structure was minimized for 1000 steps of steepest
descent minimization without coulombic
interactions, followed by 200 steps of steepest
descent and 1000 steps of conjugate gradient
minimization with coulombic charge interactions.
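
The idea behind the steepest descent cycles used in the refinement step can be illustrated with a toy calculation in Python. This sketch is not the DISCOVER force field: it assumes a one-dimensional chain of atoms joined by harmonic bonds, and the ideal bond length `d0`, force constant `k`, step size and starting coordinates are all hypothetical values chosen for illustration.

```python
def bond_energy(coords, d0=1.5, k=1.0):
    """Harmonic bond-stretch energy of a 1-D chain of atoms."""
    return sum(0.5 * k * ((b - a) - d0) ** 2
               for a, b in zip(coords, coords[1:]))


def bond_gradient(coords, d0=1.5, k=1.0):
    """Analytic derivative of bond_energy with respect to each coordinate."""
    grad = [0.0] * len(coords)
    for i in range(len(coords) - 1):
        stretch = (coords[i + 1] - coords[i]) - d0
        grad[i] -= k * stretch
        grad[i + 1] += k * stretch
    return grad


def steepest_descent(coords, cycles=100, step=0.1):
    """Move every atom a small fixed step against the gradient each cycle."""
    coords = list(coords)
    for _ in range(cycles):
        grad = bond_gradient(coords)
        coords = [x - step * g for x, g in zip(coords, grad)]
    return coords


# Hypothetical strained starting geometry: bonds of 1.0, 2.5 and 0.7.
strained = [0.0, 1.0, 3.5, 4.2]
relaxed = steepest_descent(strained, cycles=100)
```

After 100 cycles the three bond lengths in this toy chain relax toward the assumed ideal value and the energy falls well below its starting value. A production minimizer would add angle, torsion and nonbonded terms and would typically finish with conjugate gradient steps, as the text describes.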
Figures 4A-4G show the final model of the McPC603
(McPC603) Vh(Vh)-Vl(Vl) x-protein with corresponding
pairs of loops highlighted with ribbons. The symmetry
between the top and the bottom of the molecule can be
seen. Each of the 5 pairs of spliced loops is
highlighted in a separate frame.

A detailed comparison of the conformation of the
new sites is presented in Figure 5. The upper view is
of a first combining site on the top of the molecule
while the lower view is the x-site built onto the
bottom of the molecule. In the figure, corresponding
regions of the loops are highlighted and side chain
heavy atoms are shown. It can be seen that during the
minimization process, there was some divergence in the
conformation of the backbone and of the side chains
between these two sites, but there still remains an
impressive degree of homology between them. This is a
measure of the congruency of the lower loop stems and
the corresponding upper loop stems. The observed
homology also suggests that the steric environment
surrounding the spliced loops must be similar to that
of the parent loops, even though it is clear that the
loops of the first site are more exposed than are those
of the second x-site.

Example 2

Defining top and bottom loop symmetry and choosing
splice points

Because of the high level of symmetry that exists
between the Vh and Vl subunits, CDR loops can be
spliced in a "homologous" fashion (Vh top loops onto Vh
bottom loops, Vh(Vh), and Vl top loops onto Vl bottom
loops, Vl(Vl)) or in a "heterologous" fashion (Vl top
loops onto the bottom of Vh, Vh(Vl), and Vh top loops
onto the bottom of Vl, Vl(Vh)). In addition, there are
two ways to connect the Vh and Vl regions: 1) linker
peptide bridging from the C-terminus of Vh to the N-
terminus of Vl (Vh - linker - Vl), or 2) linker peptide
running from the C-terminus of Vl to the N-terminus of
Vh (Vl - linker - Vh). There are, therefore, four
possible types of x-protein derived from immunoglobulin
CDRs.
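
The count of four construct types follows from combining the two loop placements with the two linker orders. A short Python sketch makes the enumeration explicit; the names `PLACEMENTS`, `LINKER_ORDERS` and `enumerate_constructs` are hypothetical, and the `X(Y)` strings simply reuse the notation of the text (domain X carrying the CDRs of Y on its bottom loops).

```python
from itertools import product

# How the top (CDR) loops are mapped onto bottom loops for each domain.
PLACEMENTS = {
    "homologous": {"Vh": "Vh(Vh)", "Vl": "Vl(Vl)"},
    "heterologous": {"Vh": "Vh(Vl)", "Vl": "Vl(Vh)"},
}

# Domain order along the single chain, N-terminal domain first.
LINKER_ORDERS = [("Vh", "Vl"), ("Vl", "Vh")]


def enumerate_constructs():
    """Return one label per combination of loop placement and linker order."""
    constructs = []
    for (_placement, domains), (first, second) in product(
            PLACEMENTS.items(), LINKER_ORDERS):
        constructs.append(f"{domains[first]}-linker-{domains[second]}")
    return constructs
```

Calling `enumerate_constructs()` yields four labels, the first of which, `Vh(Vh)-linker-Vl(Vl)`, corresponds to the homologous Vh-linker-Vl arrangement modeled in Example 1.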

If one is splicing the CDRs of one Vh into the
bottom loops of another Vh (as in Example 1) then H2
would be spliced into the Vh C-terminal strand. The C-
terminal end of the new H2' segment would end up on the
outside of the Vh region, near its N-terminal strand
(Figure 3). Thus, if one is splicing heavy chain loops
into the bottom of the heavy chain in a sFv connected
in the Vh - Vl order, the C-terminal end of H2' can be
connected to the peptide that links the heavy and light
chains. The L2 loop structure can likewise be spliced
onto the C-terminus of Vl.

Modeling a Vh(Vh) - Vl(Vl) x-Protein based on McPC603:
The following discussion presents a detailed
analysis of how to construct a second McPC603 antigen
binding site on the bottom of McPC603 to form a homo-
domain Vh-linker-Vl x-protein. Although this
exemplification is based on McPC603, the structural
homology that exists between the framework regions of
known Fv structures clearly suggests that CDR splicing
to the bottom loops of Fv regions will work with any
antibody Fv. The existing structural homology between
known structures permits one to structurally align the
sequences of these known structures and to establish a
numbering index wherein the position numbers in the
strand regions identify unique structural locations.
The position scale used to refer to splice points is
defined in Figure 2B.

Determining loop splice points

When the CDR loops are spliced into the lower
loops, it is important to preserve those framework
residues bordering both the top loops and the bottom
loops which are critical for maintaining the proper
folding of the β-barrel structure. In the following
figures, the spatial alignment of bottom loops with the
superimposed CDR loops is shown along with the segment
of aligned sequence that derives from the chosen splice
points. Three considerations govern the choice of the
splice points:

1. Preservation of critical framework residues.

2. Maximal incorporation of CDR sequences.

3. Fusion at closest possible splice points. In order
for the spliced loops to maintain their native
structure in the new site, it is important that
the chains be distorted as little as possible at
the splice sites. Therefore one tries to choose
splice points as close as possible to the
positions where the aligned chains are closest
together, e.g., where corresponding alpha carbons
are separated by no more than a few angstroms.
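
The third consideration amounts to a distance filter over structurally aligned alpha carbons. The Python sketch below shows one way such a filter might look; it is an illustration under stated assumptions, not the procedure used with the Biosym programs. The 2.0 angstrom cutoff is an assumed stand-in for "a few angstroms", and the labels and coordinates are invented toy data.

```python
import math


def candidate_splice_points(framework_ca, loop_ca, max_separation=2.0):
    """Return aligned alpha-carbon pairs that lie within max_separation.

    Each argument is a list of (label, (x, y, z)) tuples in angstroms,
    already paired position-by-position by a structural superposition.
    """
    candidates = []
    for (fw_label, fw_xyz), (lp_label, lp_xyz) in zip(framework_ca, loop_ca):
        separation = math.dist(fw_xyz, lp_xyz)
        if separation <= max_separation:
            candidates.append((fw_label, lp_label, round(separation, 2)))
    return candidates


# Toy coordinates (hypothetical): the two chains converge toward the third pair.
framework_ca = [("fw1", (0.0, 0.0, 0.0)),
                ("fw2", (3.8, 0.0, 0.0)),
                ("fw3", (7.6, 0.0, 0.0))]
loop_ca = [("cdr1", (0.0, 6.0, 0.0)),
           ("cdr2", (3.8, 2.5, 0.0)),
           ("cdr3", (7.6, 0.5, 0.0))]

hits = candidate_splice_points(framework_ca, loop_ca)
```

With these toy coordinates only the third pair falls inside the cutoff, so it would be the only candidate splice position. The other two considerations (keeping critical framework residues and keeping as much CDR sequence as possible) would then be applied to choose among the candidates.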

Determining the Splice Points for Splicing H3 (H-TL4)
into H-BL2, and L3 (L-TL4) into L-BL2:

Figures 6 and 7 show how the CDR3 regions (H3 and
L3) would be spliced into H-BL2 and L-BL2,
respectively. Critical framework residues in H-BL2 and
L-BL2 include the Gln at position H39 in H-BL2 and the
Gln at position L45 in L-BL2.

Referring to Figure 6, residues at positions H102 to
H114 from H3 (positions H101 to H115) can be spliced
into H-BL2. The H3 residue at position H101 is not
included because it would replace the critical
framework residue at position H39. The H3 residue at
position H115 is not included because splicing from
position H114 in the H3 loop to position H47 in the
framework leads to less distortion of both the
framework and the loop conformations.
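
The H3 splice just described can be written out as a simple assembly of position labels. The Python sketch below assumes, as a reading of the text, that framework position H39 is the last residue kept before the insert and H47 is the first residue kept after it, so that the intervening bottom-loop positions are replaced. The function name, its parameters and the H36 to H50 framework window are hypothetical.

```python
def splice_loop(framework, loop, last_kept, first_resumed,
                loop_first, loop_last):
    """Assemble position labels for a framework with a loop spliced in.

    framework and loop are lists of position numbers in chain order.
    Spliced loop positions are marked with a prime, as in the text.
    """
    head = [f"H{p}" for p in framework if p <= last_kept]
    insert = [f"H{p}'" for p in loop if loop_first <= p <= loop_last]
    tail = [f"H{p}" for p in framework if p >= first_resumed]
    return head + insert + tail


framework = list(range(36, 51))    # hypothetical window, H36 to H50
h3_loop = list(range(101, 116))    # H3 loop, H101 to H115

spliced = splice_loop(framework, h3_loop,
                      last_kept=39, first_resumed=47,
                      loop_first=102, loop_last=114)
```

In the resulting list, `H39` is followed directly by `H102'`, and `H114'` is followed directly by `H47`, so the 13 loop residues H102 to H114 take the place of the framework positions between H39 and H47, while H101 and H115 are left out.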

The L3 loop (positions L97 to L104) can be spliced
into L-BL2 in a similar manner. Referring to Figure 7,
the new L3' loop includes residues from positions L99
to L104. The L3 residues at positions L97 to L98 are
not included because L98 would replace the critical
framework residue at position L45. The two chains are
sufficiently close together at L3 position L104 that
the entire C-terminal end of L3 can be spliced into the
new L3' loop. Consequently, only one residue from the
N-terminal end of L3 cannot be spliced. In the cases
of both the H3' and the L3' splices, the backbones of
the respectively aligned chains lie close enough
together at the indicated splice points that there is
little distortion of the backbone conformation in the
x-protein.

Determining the Splice Points for Splicing H1 (H-TL1)
into H-BL4, and L1 (L-TL1) into L-BL4:

Figures 8 and 9 show how the CDR1 regions would be
spliced into the fourth bottom loops. Because sections
of both H-BL4 and L-BL4 that contain non-critical
residues do not align well with the CDR1 portion of
either CDR loop, only portions of the CDR1 loops can be
spliced.

The H1 segment (positions H31 to H35) is 5
residues long in McPC603, and of these, the 3 residues
(positions H31 to H33) that can be spliced are on the
outer portion of H-TL1, which faces the binding site.
Position H92 in H-BL4 is a critical residue that is
involved in a conserved intrachain salt bridge, while
positions H96 to H98 are conserved inner strand
residues. Consequently positions H31 to H33 are
spliced between positions H92 and H96 in H-BL4 to
create H1'.

In McPC603, L1 (positions L25 to L41) is fairly long (17
residues), but only the internal 10 residues (positions
L30 to L39) can be spliced (Figure 9). Position L30 is the
first place where the L1 loop and L-BL4 align. Framework
positions L93 to L95 are conserved inner strand residues
that are structurally analogous to H96 to H98. In order to
preserve framework position L93, the L1 loop is spliced at
position L39. Consequently, 10 residues from the McPC603
L1 region can be spliced into L-BL4. In other antibodies
having shorter L1 regions, only about 3 or 4 residues can
be spliced.
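
The residue counts quoted for the CDR1 segments follow from
the inclusive position ranges. A small sketch of that
arithmetic is given here, assuming the positions in each
range are consecutive integers with no lettered insertions;
the dictionary keys and the helper name inclusive_length are
hypothetical.

```python
# Inclusive position ranges quoted in the text for McPC603 (start, end).
segments = {
    "H1 full": (31, 35),
    "H1 spliced": (31, 33),
    "L1 full": (25, 41),
    "L1 spliced": (30, 39),
}


def inclusive_length(start: int, end: int) -> int:
    """Number of residues from start to end, counting both end points."""
    return end - start + 1


lengths = {name: inclusive_length(*rng) for name, rng in segments.items()}
for name, n in lengths.items():
    print(f"{name}: {n} residues")
```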

Determining the Splice Points for Splicing H2 (H-TL2)
into the C-terminal end of Vh:
Figure 10 shows how the H2 region of the inverted copy of
McPC603 aligns with the N-terminal and the C-terminal ends
of Vh. Residue positions H48 to H50 that flank the
N-terminal end of the H2 loop region (positions H50 to
H70) align closely with Vh C-terminal residue positions
H117 to H119, respectively. When splicing in the H2' loop,
one wants to preserve as much of the Vh C-terminus as
possible. Based on the superposition of the C-terminal and
H2' backbones, splicing can be done between positions H115
and H119 in the C-terminal strand and between positions
H47 and H51 in the N-terminal flanking sequence of H2.
Because several residues in the C-terminal sequence of Vh
(and Vl as well) are probably important to stabilizing the
packing of BL1 and BL4, the splice should be made so as to
preserve as much of the C-terminus as possible. As shown
in Figure 10, therefore, the splice is made between
position H119 of the Vh C-terminal strand and position H51
of the H2' loop. The C-terminal end of H2' (arrow A,
Figure 10) is located on the surface of Vh, near the
N-terminal strand of Vh, when H2' is properly positioned
relative to H3', L3', H1' and L1'.
To connect this end to the N-terminal end of the
inter-chain linker used to create a single chain Fv (arrow
B, Figure 10), a 5 residue bridge peptide must be added
(between points B and A in Figure 10). In this
arrangement, the linker-bridge structure folds across the
bottom of the Vh chain. The preferred bridge peptide would
contain Ser for solubility and non-polar residues to help
anchor it to the surface of Vh so as to stabilize the
conformation of the H2' fold. In the model and in the
sequence shown in Figure 12, a polyalanine bridge has been
used.
Figure 11 shows how the L2 region of the inverted copy of
McPC603 aligns with the N-terminal and C-terminal ends of
Vl. Residue positions L49 to L57 that flank the L2 loop
region (positions L57 to L63) align closely with
C-terminal residue positions L102 to L109. The same
considerations are involved in making the L2' splice as
with the H2' splice. The splice is made from C-terminal
position L108 to L2 position L56. The L2 loop structure
folds around until it runs parallel to the Vl N-terminal
strand. In order to stabilize the L2' loop structure, we
include a disulfide bond from the end of the L2' loop to a
residue in the N-terminal strand of Vl. To accomplish
this, the Arg at position L68 in the C-terminal end of the
L2 loop and the Thr at position L5 in the N-terminal
strand of Vl were changed to Cys.
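
The two Cys substitutions can be expressed as checked point
replacements on a position-indexed chain. This is only an
illustrative sketch: the function name substitute and the
dictionary representation are hypothetical, and the chain
below holds only the two positions named in the text rather
than a real light chain sequence.

```python
from typing import Dict


def substitute(chain: Dict[int, str], position: int,
               expected: str, replacement: str) -> Dict[int, str]:
    """Return a copy of chain with one residue replaced, after checking
    that the residue currently at that position is the expected one."""
    found = chain.get(position)
    if found != expected:
        raise ValueError(f"expected {expected} at L{position}, found {found}")
    mutated = dict(chain)
    mutated[position] = replacement
    return mutated


# Only the two positions named in the text are filled in; a real chain
# would carry every residue of the light chain construct.
light_chain = {5: "T", 68: "R"}

light_chain_cys = substitute(light_chain, 68, "R", "C")      # Arg L68 -> Cys
light_chain_cys = substitute(light_chain_cys, 5, "T", "C")   # Thr L5 -> Cys
```

Checking the expected residue before replacing it guards
against applying the substitution to a sequence that is
numbered differently from the one in the text.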

Figure 12 shows the sequence (SEQ ID NO: 17) of the
resulting construct with the sequences of the primary CDR
sites (bold) and secondary CDR sites (underlined)
identified.

Example 3

Determining x-Site Splice Points From Primary Sequence
Alignment

The members of the Ig Superfamily share a high degree of
structural homology. By using the homology that exists
between members of known structure, it is possible to
engineer x-proteins from Ig Superfamily molecules whose
sequence is known but whose tertiary structure is not yet
determined by crystallographic or NMR methods.

Figures 13A (SEQ ID NOS: 10-16 and 18-30) and 13B (SEQ ID
NOS: 1-9 and 31-40) show the set of sequences of Fv
domains for which structures are known and are available
from the Brookhaven Protein Databank (PDB at Brookhaven
National Laboratory).

Table 1 lists the sequence and structure identification
names, along with the Kabat et al. classification and
references for the sequences that appear in Figures 2B
(SEQ ID NOS: 1-16), 13A (SEQ ID NOS: 10-16 and 18-30) and
13B (SEQ ID NOS: 1-9 and 31-40). The alignment of these
sequences is based on the structural homology that exists
between them, especially in the β-strand regions (OS and
IS). Because of this high degree of structural
correlation, one can assign positions in a "consensus 3-D
model" to residues in a set of aligned β-barrel sequences.
Furthermore, at certain positions, there exists a strong
preference for certain residues or for residues having
certain physical characteristics (Table 2). Thus, while
the 3-D structures of most Fv sequences remain
undetermined experimentally, the residue preferences at
these positions make it possible to reliably align these
sequences with those V or Ig Superfamily sequences for
which the 3-D structures have been experimentally
determined. Ig V regions can thereby be aligned with the
sequences of more distant members of the immunoglobulin
superfamily, such as the T-cell receptor, which was
successfully modeled according to the McPC603 Fv structure
(Novotny, J. et al., Proc. Natl. Acad. Sci. USA
88:8646-8650 (1991)).

Once a sequence has been positioned in relation to the set
of structurally aligned V region sequences, one can use
the splice point analysis developed in Example 2 to
construct a model and delineate a sequence for a x-protein
by analogy. The procedure for determining proper splice
points is as follows:

1. Sequences of undetermined 3-D structure are lined up
with the set of structurally aligned sequences. The
sequence alignment process makes use of the conserved and
semi-conserved residue positions listed in Table 1 to fix
the alignment in certain regions. Between these anchor
points it may be necessary to introduce gaps. Sequences in
the β-strand regions (OS and IS) generally line up without
the need for introducing gaps except in the N-terminal
section of the light chain variable region (L OS1 - L
BL1). Relative to all other light chain V region classes,
kappa light chain class V sequences have a deletion at
position L7 and an insertion at position L17. Lambda light
chain sequences also have a deletion at position L7, but
no insertion at position L17. Kappa light chain class VII
sequences have a deletion at position L23. Other than
these gaps in the N-terminal section of the light chains,
the alignment of sequences may make use of conserved
positions to establish anchor points. Further help in the
alignment of intervening sequences is derived from
conservation of sidechain properties (i.e., polar,
non-polar, charged, acidic, basic, etc.) at certain
positions. The other major gaps will exist in the CDR loop
regions (H1, L1, H2, L2, H3 and L3). Because CDR loops are
variable in sequence and length, the exact location of
gaps in these regions is less critical, as the sequences
in these regions show little homology. Structural
variation tends to be the largest at the center of CDR
loops of similar sequence, so gaps are preferably
positioned in the center of CDR top loops.

2. Once the sequences of one or more V regions of
undetermined structure have been lined up with the
structurally aligned reference set, structure positions
are assignable to the residues that locate in the regions
of the inner and outer β-strands. Because the x-site
splice points occur near the ends of the β-strand regions,
they can be assigned by analogy with the splice positions
determined for McPC603 in Example 2. Given two sequences
aligned with the reference set, the primary sequence is
that into whose bottom loops the x-site will be built, and
is referred to below as the target sequence; while the
secondary sequence contains the CDR loops which will be
built into the x-site, and which will be referred to below
as the source sequence. x-site loops are created by
splicing top loop segments from the source sequence in
place of bottom loop segments in the target sequence. For
a Vh(Vh)-Vl(Vl) x-protein construct similar to that
designed in Example 2, the spliceable portions of source
CDRs, flanked by the fixed top loop β-strand end points,
are labelled in Figure 13 as L1'(S), L2'(S), L3'(S),
H1'(S) and H3'(S), while the target bottom loop segments,
flanked by the fixed bottom loop β-strand end points, are
labelled in Figure 13 as L1'(T), L2'(T), L3'(T), H1'(T),
H2'(T) and H3'(T). Thus, to create a x-site L3' loop in
the L BL2 loop of the target sequence, one would splice
the residues identified as L3'(S) in Figure 13 to the
residues flanking the segment identified as L3'(T);
similarly, to create a x-site H1' loop in the H BL4 loop
of the target sequence, one would splice the residues
identified as H1'(S) in Figure 13 to the residues flanking
the segment identified as H1'(T), and so forth. These
splices are done to create a Vh(Vh)-Vl(Vl) x-protein as in
Example 2. To create Vh(Vl)-Vl(Vh) x-protein L3' loops in
the H BL2 loop of the target sequence, one would splice
the residues in the segment identified as L3'(S) in Figure
13 to the residues flanking the segment identified as
H3'(T). Likewise, to create a x-site H1' loop in the L BL4
loop of the target sequence, one would splice the residues
flanking the region identified as H1'(S) in Figure 13 to
the residues flanking the segment identified as L1'(T),
and so forth.

3. Alternatively, one can build structural models for the
sequences of undetermined structure using the coordinates
of one or more of the known structures as a basis for the
framework residue locations. Programs such as the Biosym
HOMOLOGY program (Biosym, Inc., San Diego, CA) are
designed to aid in this process. The procedure is well
established in the literature (Greer, J., Meth. in
Enzymol. 202:239-252 (1991)). Once constructed, models of
the parent Fv, sFv or V region and of the Fv regions of
the corresponding x-site can thus be used to determine
splice points and to construct a model of the x-protein in
the manner set forth in Examples 1 and 2. Building a model
provides insight into possible steric conflicts,
particularly, as was noted in Example 2, at the interface
between H2' and the surface residues of the Fv domain.
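
The position assignment in part 2 of the procedure can be
sketched as a walk along an alignment in which every column
carries a reference position label. The code below is a
hedged illustration only: the function name
assign_positions, the six-column example, and the query
string are hypothetical and are not taken from Figure 13.
The example was chosen to include a gap at position L7,
since the text notes that kappa and lambda light chains
have a deletion there.

```python
from typing import Dict, List, Optional


def assign_positions(aligned_query: str,
                     column_positions: List[Optional[str]]) -> Dict[str, str]:
    """Map structure position labels to residues of an aligned query.

    aligned_query uses '-' for gaps. column_positions gives, for each
    alignment column, the reference position label, or None where the
    reference set has no assigned position for that column.
    """
    if len(aligned_query) != len(column_positions):
        raise ValueError("alignment and position list differ in length")
    assigned: Dict[str, str] = {}
    for residue, label in zip(aligned_query, column_positions):
        if residue == "-" or label is None:
            continue  # gap in the query, or a column without a position
        assigned[label] = residue
    return assigned


# Hypothetical six-column slice of a light chain alignment.
columns = ["L4", "L5", "L6", "L7", "L8", None]
query = "MTQ-PA"   # illustrative residues only, with a gap at L7

assigned = assign_positions(query, columns)
```

Once residues carry position labels in this way, splice
points defined by position (as in Example 2) can be looked
up directly in a sequence of undetermined structure.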

Figure 14 shows the sequence of x-protein R19.9(D1.3) (SEQ
ID NO: 42) as it would be constructed following parts 1
and 2 of the example using the aligned sequences in Figure
14. The framework and primary CDR loops are taken from
R19.9 while the sequences for the x-site loops were taken
from the H1'(S), H2'(S), H3'(S), L1'(S), and L3'(S)
regions of D1.3.

Figure 14 also shows the sequence of x-protein MCP(MCP)
(McPC603(McPC603) (SEQ ID NO: 41)) as it would be
constructed in Examples 1 and 2. Also shown are sequences
of x-proteins 26-10(D1.3) (SEQ ID NO: 43) and
26-10(GLOOP4) (SEQ ID NO: 44) as they would be constructed
using the method of Example 3.



TABLE 2

Light Chain   Residue        Heavy Chain   Residue
Position      Preference     Position      Preference

L1            D,(E)          H1            E,(D,Q)
L2            I              H2            V
                             H3            Q,K,H
L4            M,L            H4            L
L5            T
L6            Q              H6            E,(Q)
L7            S,T,(-)        H7            (T,P,-)
L8            P,(T,E)
                             H11           (V)
                             H12           V,(M)
                             H14           P,(A)
L16           G,(L)          H15           G,S
L17
[illegible]   V,(A)                        [illegible]
L21           T,(S)          H19           K,R,S
L22           I,(M,L)        H20
L23           S,T,(-)        H21           S,T
L24           C              H22           C
L27           S,(T)
                             H25           [illegible]
                             H26           G
                             H27           Y,F,(D,T,-)
                             H28           polar
                             H29           F,I,L
                             H30           T,S,(D)
[illegible]   [illegible]
                             H36           W
                             H37           V,(I,M)
                             H38           K,R
                             H39           Q,(K)
                             H41           P,(H)
                             H42           G,(E)
                             H43           K,Q,N,R
L51           [illegible]    H45           L
L52           K,(R,Q)        H46           E,(T)
                             H47           W,(Y,H,D)
                             H48           I,(M,L)
L54           L,(W)
L55           I,(V)          H49           G,(A)
L56           Y,(G,K)
                             H51           I,(V,S)
L59           S
                             H62           (T)
L64           G
L65           V,(I)          H66           F,(L,V)
L66           P,(S)          H67           K,(Q,M)
L68           R              H69           K,R,(L)
L69           F
L70           S,(T)
L71           G,(A,V)        H72
L72           S              H73           S,T
L74           S              H75           D,(N)
L75           G
L76           T              H77           S,(T,A,P)
                             H78           polar
                             H79           S,N
                             H80           (-)
                             H81           A,L,V
                             H82           (F,H)
L80           L,(F)          H83           L,(M)
                             H84           Q,(D,E,K)
L82           I              H85           L,M,I
L83           polar          H86           S,N,(D,R)
L85           V,M,L,(A)      H88           L,(V)
L86
L88           E              H91           E,(D,A)
L89           D              H92           D
                             H93           T,S
L91           A,(G)          H94           A,(G)
L93           Y,(H)          H96           Y
L94           Y,F            H97           Y,F
L95           C              H98           C
L96           Q,(A,S)        H99           A
                             H114          D,A
L106                                       -
L107          (P)            H116          W
L108          G              H117          G
                             H118          Q,A
L110          G              H119          G
L111          T              H120          T
L112          K
L113          L,(V)          H122          V,(L)
L114          E,(T,Q,K)      H123          T
L115          I,(L,V)        H124          V
L116          K              H125          S

Notes: 1) Single letter amino acid codes; letters in
parentheses signify less common preferences.

2) (-) signifies the occurrence of a gap.

3) Each row corresponds to an equivalent location in the
Vh and Vl chains.
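
Residue preferences of this kind can serve as a consistency
check on a proposed alignment. The sketch below copies a
few rows of Table 2 and reports positions whose residue
matches neither the common nor the less common preference.
It is an assumption-laden illustration: the names
PREFERENCES and check_anchor_residues and the candidate
assignment are hypothetical, and only the listed table rows
come from the document.

```python
from typing import Dict, List, Tuple

# A few rows copied from Table 2: (common residues, less common residues).
PREFERENCES: Dict[str, Tuple[str, str]] = {
    "L24": ("C", ""),
    "L95": ("C", ""),
    "H22": ("C", ""),
    "H36": ("W", ""),
    "H39": ("Q", "K"),
    "H98": ("C", ""),
}


def check_anchor_residues(assigned: Dict[str, str]) -> List[str]:
    """Return the position labels whose residue matches neither the
    common nor the less common preference listed for that position."""
    mismatches: List[str] = []
    for label, (common, rare) in PREFERENCES.items():
        residue = assigned.get(label)
        if residue is None:
            continue  # position not present in this sequence
        if residue not in common + rare:
            mismatches.append(label)
    return mismatches


# Hypothetical assignment for a heavy chain fragment; the Ser at H98 is
# deliberately wrong so that the check has something to report.
candidate = {"H22": "C", "H36": "W", "H39": "K", "H98": "S"}
problems = check_anchor_residues(candidate)
```

A mismatch at a strongly conserved position such as H98
would suggest that the alignment is shifted in that region
and that a gap should be placed differently.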

EXAMPLE 4

Construction, Expression and Evaluation of a x-Protein

The x-1 and x-2 genes (Figures 15A and B, SEQ ID NOS: 45
and 47) were prepared by mutagenesis of the 26-10 sFv gene
as described in (Huston, J.S., et al., Proc. Natl. Acad.
Sci. USA, 85:5879-5883 (1988); Tai, M.-S., et al.,
Biochemistry 29:8024-8030 (1990)) and were incorporated
into the pET vector described in Studier, F.W., and
Moffatt, B.A., J. Mol. Biol. 189:113 (1986) behind a T7
promoter.

Upon transformation of E. coli with this vector, direct
expression produced each in the form of cytoplasmic
inclusion bodies. Cells were treated with lysozyme to
allow cell lysis and ultracentrifugal isolation of the
inclusion bodies, which were then dissolved in 6 M
guanidinium chloride containing 10 mM dithiothreitol, 25
mM Tris, and 10 mM EDTA at pH 8.1; the solution was
incubated overnight at room temperature. The protein was
then diluted into 3 M or 3.5 M urea buffer containing 25
mM Tris, 10 mM EDTA, and a glutathione redox couple (1 mM
oxidized, 0.1 mM reduced) at pH 8.1. Following 18 h at
4°C, each was fully renatured by dialysis into phosphate
buffered saline (PBS), consisting of 0.05 M potassium
phosphate, 0.15 M NaCl, pH 7.0, and 0.03% NaN3. The x-1
and x-2 solutions were then passed through
ouabain-Sepharose columns, washed first with PBSA (PBS +
0.03% NaN3), then with 1 M NaCl in PBSA to remove any
unbound material from the columns, and elution effected by
displacing specifically bound protein with 20 mM ouabain
in PBSA. The affinity purified x-1 and x-2 proteins were
then examined by SDS polyacrylamide gel electrophoresis.
Figures 16A and B show the x-1 (SEQ ID NO: 46) and x-2
(SEQ ID NO: 48) polypeptide chains following their
affinity isolation. In Figure 16B, the oxidized x-2 (upper
band) is compared with 26-10 sFv (lower band) in lanes
denoted as mixture.

Sequence analysis of the x-2 protein verified that the
insertions noted in the gene sequence (Figure 15B) were
present in the protein sequence. CNBr fragments of the
protein were made and sequenced as a mixture in an Applied
Biosystems 470A gas-phase sequencer equipped with a model
120A on-line analyzer. The gel (Figure 16B) is also
consistent with the increased molecular weight of the x-2
over the 26-10 sFv. Refolded protein that bound to the
ouabain-Sepharose column has necessarily regained active
antibody combining sites for digoxin-like cardiac
glycosides typified by ouabain.

Additional insights into the shape and properties of the
x-1 and x-2 proteins are apparent from Superdex 75 size
exclusion chromatography of the affinity purified x
proteins in comparison to the 26-10 sFv, as shown in
Figure 17. The single H1' insertion of x-1 adds two
tyrosyl residues within the GYGY sequence, resulting in a
pronounced tendency to dimerize and a very skewed profile
indicative of dissociation into monomer (data not shown).
In contrast, data shown in Figure 17 for x-2 (top panel)
indicate that it appears perfectly behaved in solution,
devoid of any apparent dimer. Thus, as one incorporates a
multiplicity of x-CDRs in V regions, the known
hydrophobicity of CDR sequences apparently is contained by
the aggregate of x-CDR conformation and interaction. The
added H1' and H2' loops necessarily increase the protein's
Stokes radius, resulting in its elution position being
between the 26-10 monomer and dimer positions (bottom
panel). A mixing experiment that combines both proteins in
a single chromatographic separation (middle panel)
indicates that the x-2 shows no apparent interaction with
the 26-10 monomer and dimer species, as the middle profile
appears to be a simple additive composite of the top and
bottom chromatograms. The peaks beyond 30 minutes
(horizontal axis is in minutes) are simply injection
artifacts on the HPLC system, probably due to buffer
differences between the column and sample.

Equivalents 

Those skilled in the art will recognize, or be able to
ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be
encompassed by the following claims.

CLAIMS

The invention claimed is: 



1. A chimeric multivalent immunoglobulin (Ig) Superfamily
protein analogue comprising one or more polypeptide
chains forming a β-barrel domain containing
complementarity-determining region-like (CDR-like)
regions and framework region-like (FR-like) regions,
said CDR-like regions defining a ligand binding site
and said protein analogue having at least one
additional ligand binding site segment spliced into
the FR-like regions of said β-barrel domain.

2. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein said polypeptide chains have an
amino acid sequence wherein said sequence is
substituted or modified in the amino acid sequence of
at least one amino acid residue.

3. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein a non-covalently associated two
chain polypeptide forms a β-barrel domain.



4. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein a single chain polypeptide forms a
β-barrel domain.

5. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 comprising a single chain polypeptide
forming a β-barrel domain wherein said single chain
polypeptide is comprised of two polypeptide chains
connected by a polypeptide linker spanning the
distance between the C-terminus of one chain to the
N-terminus of the other chain.

6. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein the polypeptide chain is selected
from the group consisting of: heavy chain (H), light
chain (L), α chain (α), β chain (β), γ chain (γ), δ
chain (δ), or ε chain (ε).

7. Biological material having a nucleotide sequence which
encodes a chimeric multivalent Ig Superfamily protein
analogue of Claim 1.

8. A replicable recombinant DNA expression vector
containing the nucleotide sequence of Claim 7.

9. A chimeric multivalent antibody analogue comprising
one or more polypeptide chains forming a β-barrel
domain containing complementarity determining regions
(CDRs) and framework regions (FRs), said CDRs defining
an antigen binding site, said antibody analogue having
at least one additional antigen binding site segment
spliced into FRs of said β-barrel domain.

10. A chimeric multivalent antibody analogue of Claim 9
wherein a non-covalently associated two chain
polypeptide forms a β-barrel domain.

11. A chimeric multivalent antibody analogue of Claim 9
wherein a single chain polypeptide forms a β-barrel
domain.

12. A chimeric multivalent antibody analogue of Claim 9
wherein said CDRs and FRs are comprised of heavy chain
(H) polypeptide chains and light chain (L) polypeptide
chains derived from variable regions (V) of
immunoglobulin proteins.

13. A chimeric multivalent antibody analogue of Claim 12
wherein the CDRs are spliced into FRs of the β-barrel
domain to form an additional binding site segment such
that a variable heavy chain (Vh) CDR is spliced into a
Vh FR to form a Vh(Vh) polypeptide chain.

14. A chimeric multivalent antibody analogue of Claim 12
wherein the CDRs are spliced into FRs of the β-barrel
domain to form an additional binding site segment such
that a variable light chain (Vl) CDR is spliced into a
Vl FR to form a Vl(Vl) polypeptide chain.

15. A chimeric multivalent antibody analogue of Claim 12
wherein the CDRs are spliced into the FRs of the
β-barrel domain to form an additional binding site
segment such that a Vh CDR is spliced into a Vl FR to
form a Vl(Vh) polypeptide chain.

16. A chimeric multivalent antibody analogue of Claim 12
wherein the CDRs are spliced into the FRs of the
β-barrel domain to form an additional binding site
segment such that a Vl CDR is spliced into a Vh FR to
form a Vh(Vl) polypeptide chain.

17. A chimeric multivalent antibody analogue of Claim 9
comprising a single chain polypeptide forming a
β-barrel domain wherein said single chain polypeptide
is comprised of two polypeptide chains connected by a
polypeptide linker spanning the distance between the
C-terminus of one chain to the N-terminus of the other
chain.

18. A chimeric multivalent antibody analogue of Claim 17
wherein said two polypeptide chains connected by a
linker further comprise two Vh(Vh), Vl(Vl), Vh(Vl) or
Vl(Vh) polypeptide chains.

19. A chimeric multivalent antibody analogue of Claim 18
wherein to the N-terminal end of the polypeptide
linker spanning the distance between the C-terminus of
one polypeptide chain to the N-terminus of the other
polypeptide chain is added a polypeptide residue
bridge which connects the N-terminal end of the linker
to the C-terminal end of a CDR sequence which has been
added to the C-terminal end of a FR sequence.

20. The polypeptide linker and bridge of Claim 19 
comprising at least 19 amino acid residues. 

21. A chimeric multivalent antibody analogue of Claim 9
wherein said CDRs and FRs are of mammalian origin.

22. A chimeric multivalent antibody analogue of Claim 9
wherein said CDRs and FRs are of mouse myeloma origin.

23. Biological material having a DNA sequence which
encodes the chimeric binding protein of Claim 9.

24. A replicable recombinant DNA expression vector
containing the DNA sequence of Claim 23.

25. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein, upon the binding of a ligand to
one binding site, a conformational change is initiated
in the β-barrel domain such that the affinity of the
second binding site is modified.

26. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein an additional polypeptide effector
molecule having a biological activity is linked to the
N- or C-terminus of said β-barrel domain, said
biological activity independent of the ligand binding
activity of the chimeric multivalent protein analogue.



27. A chimeric multivalent Ig Superfamily protein analogue 
of Claim 1 wherein one binding site is reactive with a 
diagnostic imaging agent. 



28. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with a
radioisotope.

29. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with a
cytotoxic substance.

30. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with
an effector molecule.

31. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with a
marker on a cytotoxic cell.

32. A method for imaging specific tissue in a host
comprising:

a) administering to a host a chimeric multivalent Ig
Superfamily protein analogue of Claim 1 having
one binding site reactive with a targeted tissue
specific antigen and a second binding site
reactive with a diagnostic imaging agent under
conditions wherein said protein analogue binds to
the targeted tissue; and

b) administering the imaging agent to the host under
conditions whereby said imaging agent binds to
the chimeric multivalent Ig Superfamily protein
analogue resulting in a detectable image of the
targeted tissue.

33. A pharmaceutical composition for administration to a
host for imaging specific tissue in a host comprising
the chimeric multivalent Ig Superfamily protein
analogue of Claim 32 in a pharmaceutically acceptable
carrier.

34. A method of irradiating specific tissue in a host
comprising:

a) administering to a host a chimeric multivalent Ig
Superfamily protein analogue of Claim 1 having
one binding site reactive with a targeted tissue
specific antigen and a second binding site
reactive with a radioisotope under conditions
whereby said protein analogue binds to the
targeted tissue; and

b) administering the radioisotope to the host under
conditions whereby said radioisotope binds to the
chimeric multivalent Ig Superfamily protein
analogue wherein binding of said chimeric protein
analogue to targeted tissue and binding of said
radioisotope to said chimeric protein analogue
results in irradiation of the targeted tissue.

35. A pharmaceutical composition for administration to a
host for irradiating specific tissue in a host
comprising the chimeric multivalent Ig Superfamily
protein analogue of Claim 34 in a pharmaceutically
acceptable carrier.

36. A method of delivering a cytotoxic substance to
specific tissue in a host comprising:

a) administering to a host a chimeric multivalent Ig
Superfamily protein analogue of Claim 1 having
one binding site reactive with a targeted tissue
specific antigen and a second binding site
reactive with a cytotoxic substance under
conditions whereby said protein analogue binds to
the targeted tissue; and

b) administering the cytotoxic substance to the host
under conditions whereby said cytotoxic substance
binds to the chimeric multivalent Ig Superfamily
protein analogue, wherein binding of said
chimeric protein analogue and binding of said
cytotoxic substance to said chimeric protein
analogue results in delivering the toxic
substance to the targeted tissue.

37. A pharmaceutical composition for administration to a
host to deliver a cytotoxic substance to specific
tissue in a host comprising the chimeric multivalent
Ig Superfamily protein analogue of Claim 36 in a
pharmaceutically acceptable carrier.

38. A method of lysing target cells in a host having
cytotoxic cells comprising administering to a host a
chimeric multivalent Ig Superfamily protein analogue
of Claim 1 having one binding site reactive with a
surface receptor of a cell targeted to be lysed, and a
second binding site reactive with a marker on a
cytotoxic cell under conditions whereby said protein
analogue binds to the targeted cell and said cytotoxic
cell binds to the chimeric multivalent Ig Superfamily
protein analogue, wherein binding of said chimeric
protein analogue and binding of said cytotoxic cell
results in lysis of the targeted cell.

39. A pharmaceutical composition for administration to a
host a cytotoxic cell which lyses a target cell
comprising the chimeric multivalent Ig Superfamily
protein analogue of Claim 38 contained in a
pharmaceutically acceptable carrier.

40. A method of modifying the function of a cell surface
receptor of specific tissue in a host comprising
administering to a host a chimeric multivalent Ig
Superfamily protein analogue of Claim 1 having one
binding site reactive with a targeted cell surface
receptor and a second binding site reactive with an
effector molecule under conditions whereby said
protein analogue binds to the targeted tissue and said
effector molecule binds to the chimeric multivalent Ig
Superfamily protein analogue, wherein binding of said
chimeric protein analogue and binding of said effector
molecule results in selective modification of the
function of the targeted cell surface receptor.

41. A pharmaceutical composition for administration to a
host an effector molecule which modifies the function
of a cell surface receptor of a specific tissue
comprising the chimeric multivalent Ig Superfamily
protein analogue of Claim 40 contained in a
pharmaceutically acceptable carrier.

42. The chimeric multivalent Ig Superfamily protein
analogue of Claim 1 having one binding site reactive
with a preselected ligand and a second binding site
reactive with a substance labeled with a radioisotope
or enzyme suitable for use as a quantifying agent in
an in vitro diagnostic assay.

43. The chimeric multivalent Ig Superfamily protein
analogue of Claim 1 having one binding site reactive
with a preselected ligand and a second binding site
having catalytic activity.

44. The chimeric multivalent Ig Superfamily protein 
analogue of Claim 43 wherein both binding sites have 
catalytic activity. 

45. The chimeric multivalent Ig Superfamily protein
analogue of Claims 43 and 44 contained in a 
physiologically compatible carrier solution for 
administration to vertebrates. 

46. The chimeric multivalent Ig Superfamily protein
analogue of Claim 1 crosslinked at sites other than
ligand binding sites to form a two dimensional array
of chimeric multivalent protein analogues.

47. A method for producing a chimeric multivalent Ig
Superfamily protein analogue comprising the steps of:
a) determining the splice points for CDR-like
regions to form additional ligand binding site
segments on the FR-like regions of a β-barrel
domain whereby insertion of CDR-like region amino
acid residues into the FR-like region residues
maintains the folded structure required for
binding activity with a preselected ligand;
b) determining the amino acid sequence of the
resulting construct having a first ligand binding
site and a second ligand binding site;
c) deducing the DNA sequence encoding the amino acid
sequence of b);
d) synthesizing the DNA sequence;
e) inserting the DNA sequence into an appropriate
expression vector and expressing the polypeptide
in a suitable host system;
f) isolating and purifying the expressed
polypeptide; and
g) refolding the purified polypeptide to its
immunologically reactive conformation, thereby
resulting in a chimeric multivalent Ig
Superfamily protein analogue.

48. The method of Claim 47 wherein determining the splice
points is accomplished computationally by use of a 
computer-generated three dimensional structure of the 
chimeric multivalent Ig Superfamily protein analogue. 



49. The method of Claim 47 wherein determining the splice
points is accomplished by primary sequence alignment.

50. A method for effecting cell-cell interactions
comprising administering to a host a chimeric
multivalent Ig Superfamily protein analogue of Claim 1
having one binding site reactive with a targeted cell
surface receptor of a first cell and a second binding
site reactive with a targeted cell surface receptor of
a second cell under conditions whereby said protein
analogue binds to said first cell and second cell
wherein binding of said first cell and second cell to
the chimeric protein analogue results in interaction
between the two cells.

51. A molecular switch comprising a chimeric multivalent
Ig Superfamily protein analogue of Claim 1 having one
binding site initiating a conformational change in
said chimeric protein analogue when said binding site
is bound to ligand, whereby the conformational change
causes the chimeric protein analogue to act as a
molecular switch.

52. A chimeric multivalent Ig Superfamily protein analogue 
according to Claim 1 for use in therapy or diagnosis.

53. An analogue according to Claim 52 for use in (a)
imaging specific tissue in a host; or (b) irradiating
specific tissue in a host; or (c) delivering a
cytotoxic substance to specific tissue in a host; or
(d) lysing target cells in a host; or (e) modifying
the function of a cell surface receptor of specific
tissue in a host; or (f) effecting cell-cell
interactions in a host.

54. Use of a chimeric multivalent Ig Superfamily protein
analogue according to Claim 1 for the manufacture of a
diagnostic agent for imaging specific tissue in a
host.

55. Use of a chimeric multivalent Ig Superfamily protein
analogue according to Claim 1 for the manufacture of a
medicament for (a) irradiating specific tissue in a
host; or (b) delivering a cytotoxic substance to
specific tissue in a host; or (c) lysing target cells
in a host; or (d) modifying the function of a cell
surface receptor of specific tissue in a host; or (e)
effecting cell-cell interactions in a host.
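
By way of illustration only, step c) of Claim 47 (deducing a DNA sequence that encodes a given amino acid sequence) may be sketched in Python as follows. The sketch assumes the standard genetic code and picks one fixed codon per residue; the codon choices, the function names and the example peptide are hypothetical and are not taken from the specification. A practical design would also weigh the codon usage of the chosen host system.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Assumption: one arbitrarily chosen codon per amino acid (not host-optimised).
PREFERRED_CODON = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}
STOP_CODON = "TAA"


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Return one DNA sequence encoding the one-letter protein sequence."""
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODON:
            raise ValueError(f"unknown amino acid code: {residue!r}")
        codons.append(PREFERRED_CODON[residue])
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


def translate(dna: str) -> str:
    """Translate DNA in frame 1 until the first stop codon or the end."""
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TO_AA[dna[i:i + 3].upper()]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    peptide = "MKV"  # hypothetical example peptide
    dna = back_translate(peptide)
    print(dna)
    print(translate(dna) == peptide)
```

Translating the deduced sequence back to protein, as in the last two lines, is a simple check that the DNA sequence encodes the amino acid sequence of step b) before it is synthesized in step d).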

FIG. 17A

26-10 sFv + X-2 protein